PROJECT SUMMARY


BIOLOGY-INSPIRED  AUTONOMOUS  CONTROL 

AFOSR  Laboratory  Research  Task  00MN02COR 

Johnny  Evers,  Principal  Investigator 
Flight  Vehicles  Integration  Branch,  Munitions  Directorate 
Air  Force  Research  Laboratory,  Munitions  Directorate 
Eglin  AFB,  FL 


Abstract 

The  goal  of  this  project  is  to  motivate  development  of  control  concepts  for  autonomous  munitions  that 
overcome  limitations  of  conventional  approaches  by  applying  principles  derived  from  studying  the 
biology  of  flying  organisms.  The  research  is  focused  on  understanding  the  mechanisms  of  biological 
flight  through  collaboration  with  various  experimental  biology  academic  research  laboratories  around  the 
world.  This  exploration  of  biological  flight  includes  behavior,  vision  and  other  sensory  systems,  flapping 
flight  mechanics  and  aerodynamics,  and  flight  control.  The  research  focus  addresses  two  broad, 
interrelated  research  areas:  concepts  for  aeroelastic,  propulsive  flight  inspired  by  the  biomechanics, 
aerodynamics,  sensing  and  neurobiology  of  flapping  flight  and  wide  field  sensory-response  inspired  by 
the behavior and neurobiology associated with spatial orientation, target pursuit and navigation in
insects,  birds  and  bats.  With  insight  from  these  biology  studies,  the  research  seeks  to  motivate  and 
develop  new  guidance  and  control  concepts,  theory,  and  methods  for  advanced  munitions  and  micro  air 
vehicle  programs. 

Recent  Progress 

Publications  listed  below  highlight  some  of  the  research  conducted  under  this  project  over  the  past  two 
years.  The  first  paper  investigates  the  potential  for  load  sensors  on  small  air  vehicle  aerodynamic 
surfaces  to  enhance  body  platform  stability.  Two  complementary  techniques  are  explored:  one  using  body 
torque  error  to  control  actuator  position  and  the  other  using  body  force  sensing  to  compensate  for  high 
optical  feedback  latency.  The  benefits  of  responding  reflexively  to  forces  on  the  aerodynamic  surfaces 
include  low  latency,  a  reference  frame  inherently  consistent  with  the  control  actuation,  and  alleviation  of 
the  necessity  for  control  based  explicitly  on  aerodynamic  characterization.  This  paper  uses  6DOF 
simulation  to  demonstrate  the  robustness  derived  from  load  sensing  in  a  turbulent  flow  field  with  high 
levels  of  plant  uncertainty  and  optical  feedback  latency.  The  results  of  this  paper  suggest  that  direct 
sensing  of  forces  acting  on  the  body  can  significantly  enhance  the  robustness  and  performance  of  an 
attitude  control  system,  perhaps  giving  insight  into  how  natural  systems  can  fly  with  high  levels  of 
damage,  coarse  sensors,  and  large  sensorimotor  information  processing  latencies. 

The  second  paper  is  motivated  by  a  desire  to  develop  analytical  formulations  for  cooperative  defensive 
strategies  against  predator(s).  A  single-pursuer,  two-evader  differential  game  with  a  novel  cost  functional 
is formulated. Each of the three agents is modeled as a massless particle that moves with constant
velocity. The pursuer attempts to capture either of the evaders while minimizing its cost. Simultaneously,
the  evaders  strive  to  maximize  the  pursuer’s  cost.  The  proposed  cost  functional  represents  the  increased 
cost  to  the  pursuer  when  presented  with  multiple,  potentially  dangerous  targets.  It  captures  the  effect  of 
cooperation  between  the  evaders.  In  order  to  solve  the  game,  optimality  conditions  for  the  equilibrium 
strategies are developed. The resulting system of ordinary differential equations is then integrated
backwards in time from the terminal conditions to generate the optimal trajectories of the three-agent
system.  The  resulting  trajectories  display  cooperative  behaviors  between  the  two  evaders,  which  are 
qualitatively  similar  to  behaviors  found  in  predator-prey  interactions  in  nature.  A  brief  description  of 
singular  surfaces  is  also  included. 

The third paper discusses the collection, post-processing and subsequent evaluation of flight data of
butterflies in various free flight scenarios in a quasi-natural environment. A vision tracking system is
used  to  obtain  the  flight  data.  This  in  turn  is  used  to  determine  estimates  of  the  motion  of  different  body 
parts  of  the  insect,  including  the  abdomen  and  the  wings.  These  estimates  are  subsequently  analyzed  with 
a  view  to  establishing  the  manner  in  which  the  insect  adapts  the  motion  of  its  abdomen  to  work  in  tandem 
with  the  motion  of  its  wings.  Furthermore,  the  manner  in  which  this  adaptation  changes  through  different 
flight  phases  is  studied. 

The  fourth  paper  explores  the  issues  of  control  of  aeroelastic  wing  micro  autonomous  aerial  systems. 
Controllers  designed  using  methods  applicable  to  larger  aircraft  are  unlikely  to  realize  the  agile  flight 
potential of flexible wing micro autonomous aerial systems airframes. In this paper, two Euler-Bernoulli
beams connected to a rigid mass represent a conceptual model of an aeroelastic wing micro autonomous
aerial system. Continuous Sensitivity Equation Methods are employed to examine the sensitivity of the
controlled state with respect to variation of the H-infinity control parameter, with the primary goal being to
gain  insight  into  the  flexible  dynamics  of  the  system  in  order  to  exploit  the  flexibility  for  control 
purposes.  The  paper  further  examines  functional  gains  in  order  to  determine  optimal  sensor  placement 
while  taking  advantage  of  the  flexibility  of  the  micro  autonomous  aerial  systems  model. 

The  final  paper  of  this  collection  addresses  some  of  the  technical  challenges  associated  with  development 
of  bird  or  insect  size  micro  autonomous  aerial  systems.  It  takes  the  perspective  that  agile  micro 
autonomous  aerial  systems  with  their  layers  of  human  supervision  represent  complex,  highly  nonlinear 
multi-scale  dynamical  systems.  After  a  brief  discussion  of  some  issues  of  scale  for  such  systems  and 
current  research  investigating  those  issues,  the  paper  focuses  on  the  idea  of  autonomy  associated  with 
multi-scale  dynamical  systems.  Agile  micro  autonomous  aerial  systems  currently  exist  only  in  nature  (i.e., 
insects,  birds,  bats).  Consequently,  the  paper  considers  autonomy  in  manmade  micro  autonomous  aerial 
systems  from  a  biological  perspective.  It  introduces  a  conjecture  that  functional  system  characteristics 
associated  with  the  capabilities  of  living  flying  organisms  may  require  levels  of  response  variation  and 
flexibility that are not associated with, and perhaps will not be tolerated in, manmade critical systems.
Although this paper does not directly address questions of ethics associated with the deployment of critical
autonomous  systems,  it  attempts  to  provide  some  insight  into  how  those  important  questions  may 
naturally  emerge  when  any  degree  of  robustness  is  imposed  as  a  design  criterion  for  manmade  agile 
autonomous  systems. 

Attitude Control Augmentation Using Wing Load Sensing - A
Biologically  Motivated  Strategy 

Rhoe A. Thompson*, Johnny H. Evers†, Kelly C. Stewart‡

AFRL/RW, Eglin AFB, FL, 32542, USA


Many  flying  animals  are  able  to  achieve  highly  robust  flight  without  feedback  from  dedicated  angular  rate 
sensors.  In  general,  these  animals  use  their  vision  systems  to  provide  attitude  rate  and  orientation  information. 
Limitations  of  vision  based  measurements  for  stabilizing  the  body  include  the  high  level  of  latency  incurred  in 
the  visual  processing  system  and  the  need  to  maintain  some  level  of  ocular  isolation  in  order  to  achieve  adequate 
image  quality.  This  paper  investigates  the  potential  for  load  sensors  on  the  aerodynamic  surfaces  to  enhance 
body  platform  stability.  Two  complementary  techniques  are  explored:  one  using  body  torque  error  to  control 
actuator  position  and  the  other  using  body  force  sensing  to  compensate  for  high  optical  feedback  latency.  The 
benefits  of  responding  reflexively  to  forces  on  the  aerodynamic  surfaces  include  low  latency,  a  reference  frame 
inherently  consistent  with  the  control  actuation,  and  alleviation  of  the  necessity  for  control  based  explicitly 
on  aerodynamic  characterization.  This  paper  uses  6DOF  simulation  to  demonstrate  the  robustness  derived 
from  load  sensing  in  a  turbulent  flow  field  with  high  levels  of  plant  uncertainty  and  optical  feedback  latency. 
The  results  of  this  paper  suggest  that  direct  sensing  of  forces  acting  on  the  body  can  significantly  enhance  the 
robustness  and  performance  of  an  attitude  control  system,  perhaps  giving  insight  into  how  natural  systems  can 
fly  with  high  levels  of  damage,  coarse  sensors,  and  large  sensorimotor  information  processing  latencies. 


Nomenclature 


$\psi$                      Azimuth or Yaw Euler Angle
$\theta$                    Elevation or Pitch Euler Angle
$\phi$                      Bank or Roll Euler Angle
$V$                         Inertial Velocity Magnitude
$[p, q, r]$                 Body Angular Rate Components
$b$                         Reference Lateral Length
$c$                         Reference Longitudinal Length
$\omega_n$                  Natural Frequency
$J$                         Moment of Inertia
$\zeta$                     Damping Ratio
$K_p, K_d$                  Proportional and Derivative Control Gains
$K_t$                       Gain Associated with Torque Feedback
$\Delta t_{opt}$            Optical Feedback Latency
$\theta_m, T_m$             Measured Angle and Torque States
LQR                         Linear Quadratic Regulator
MAV                         Micro Air Vehicle
6DOF                        Six Degree of Freedom Simulation
GenMAV                      Generic Micro Air Vehicle
PD                          Proportional Derivative Control
PID                         Proportional Integral Derivative Control
PDT                         Proportional Derivative Torque Control

I.  Introduction 

Insects are commonly used as research subjects for flight control physiology studies due to the reduced complexity
of their morphology, physiology, and behavioral response. The ability of insects to perform precision navigation
is also widely studied. 1 2 Flying insects are abundant and readily available, and they are considered models for the
characteristics desired in man-made micro-air vehicles. 3 4 Insects robustly deal with damage to their bodies and uncertainty
in their environments. They are adaptable, autonomous, and can readily change behavioral objectives. Insects,
in all of their various forms, have a wide array of discrimination and target-tracking capabilities, using optical, acoustic
and chemo-receptive modalities. 5 6 7 All flying insects appear to take advantage of optical rate feedback in their
flight  control  systems.  Insects  of  the  order  Diptera,  flies,  also  use  mechanoreceptive  angular  rate  feedback  from  the 
halteres. 8  9 10  Those  insects  having  only  optical  rate  feedback  are  capable  of  remarkable  flight  performance.  Given  the 
amount of latency inherent in the optical feedback pathways, the specific mechanisms through which flight stability is
achieved remain unclear. 11 It is this characteristic that is the motivation for the work described in this paper.

*rhoe.thompson@eglin.af.mil, AFRL/RWGG, AIAA Member
†johnny.evers@eglin.af.mil, AFRL/RWAV, AIAA Member
‡kelly.stewart@eglin.af.mil, AFRL/RWGN, AIAA Member

The  intent  of  this  research  is  to  understand  the  benefits  of  load  sensing  on  aerodynamic  surfaces  for  attitude 
stabilization.  The  bodies  of  animals  are  sensor  rich.  Strain  sensors  that  respond  to  internal  and  external  forces  on  the 
exoskeleton  are  common  if  not  universal. 12 13  In  addition  to  having  influence  on  high-level  behaviors,  these  sensors 
have  evolved  to  provide  low-latency  reflexive  response  as  well.  The  wings  of  insects  have  cuticular  strain  sensors, 
referred  to  as  campaniform  sensilla,  distributed  along  the  structural  veins,  as  well  as  chordotonal  organs  that  stretch  and 
respond  to  motion  of  the  wing  hinge. 14  These  sensors  encode  magnitude  of  the  wing  load  through  species-dependent 
mechanisms. 15  The  pathways  to  the  wing  control  muscles  are  short,  with  low  latency,  leading  to  speculation  that  they 
are  directly  involved  in  flight  control. 16  Given  the  relatively  high  level  of  latency  involved  in  rate  feedback  from  the 
insect  visual  system,  it  is  likely  that  the  wing  load  sensors  play  a  direct  role  in  attitude  stabilization.  This  role  is 
especially  indicated  in  natural  systems  that  do  not  have  a  direct,  low-latency  means  of  measuring  angular  rate  attached 
to  the  main  body,  i.e.,  halteres  or  other  gyroscopic  organs. 

The  point  of  departure  for  this  activity  was  a  subsequently  discarded  hypothesis:  strain  sensed  on  the  wings  is 
proportional  to  angular  rate  in  the  body  frame.  Therefore,  by  reacting  to  wing  strain,  a  winged  vehicle  could  apply 
a  dissipative  damping  force  that  ensures  attitude  stability.  The  origin  of  this  thought  process  was  the  understanding 
that  a  steady  state  roll  motion  would  induce  a  differential  angle  of  attack  on  the  wings  proportional  to  roll  rate.  This 
differential  angle  of  attack  would  in  turn  result  in  a  differential  force,  or  roll  damping,  on  the  wings  which  might  be 
sensed  and  controlled.  Therefore,  the  differential  wing  load  would  be  proportional  to  roll  rate.  While  the  described  rate 
damping  effects  are  very  real,  the  inability  to  separate  other  dynamic  causal  effects,  e.g.,  control  surface  deflection  and 
transient  gusts,  from  the  steady  state  mechanism  hypothesized  was  felt  to  be  insurmountable.  Alternative  mechanisms 
were  therefore  pursued. 
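
The steady-state mechanism behind the discarded hypothesis can be illustrated numerically. The sketch below is only an illustration under stated assumptions, not the authors' model: it uses the standard kinematic relation that a wing station at spanwise distance y from the roll axis sees an angle-of-attack change of atan(p y / V) during a steady roll at rate p. The function name, the airspeed, and the roll rates are hypothetical values chosen for the example.

```python
import math

def induced_alpha(p_rad_s, y_m, v_m_s):
    """Angle-of-attack change (rad) at spanwise station y during a steady roll.

    Kinematics only: the station moves normal to the flight path at p*y
    while the vehicle flies forward at V.
    """
    return math.atan2(p_rad_s * y_m, v_m_s)

# Semi-span from a 24 in wingspan; the airspeed is an assumed value.
semi_span = 0.5 * 24 * 0.0254   # metres
airspeed = 15.0                 # m/s, hypothetical

for roll_rate in (0.5, 1.0, 2.0):   # rad/s, hypothetical
    tip = induced_alpha(roll_rate, semi_span, airspeed)
    # The down-going tip gains `tip`, the up-going tip loses it,
    # so the tip-to-tip differential is twice that value.
    print(f"p = {roll_rate:.1f} rad/s, differential tip alpha = "
          f"{math.degrees(2.0 * tip):.2f} deg")
```

For small angles the differential grows almost linearly with roll rate, which is the proportionality the hypothesis relied on; the sketch says nothing about the control-surface and gust effects that made the approach impractical.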

Figure 1. a) Baseline attitude control without load feedback. Actuators are assumed to have a first-order open-loop response. b) Attitude control with
load feedback to optimize responsiveness of the attitude controller and to reject disturbances and errors in actuated body torque. The
closed-loop actuator response is modeled as a damped second-order system.

In its simplest form, the attitude control of a flight vehicle can be described as in Figure 1a. Angular rate and
orientation  resulting  from  the  system  dynamics  are  sensed,  and  the  measurements  are  fed  back  into  the  attitude  control 
system.  The  attitude  control  system  then  commands  a  control  surface  response,  intended  to  produce  a  torque  on  the 
body  in  order  to  reduce  attitude  state  errors.  There  are  two  fundamentally  distinct  ways  in  which  strain  measurements 
might  influence  the  attitude  control  design:  through  regulation  of  the  actuation  commands  sent  to  the  control  surfaces 
and through augmentation of the attitude controller (Figure 1b). In the first way, the measured error in the body torque
achieved by the vehicle control surfaces can be driven to zero using the actuators. The source of this error might be uncertainty
in the plant characteristics or a torque disturbance on the body from external sources such as turbulence. The
second  way  that  load  sensing  might  be  used  in  the  control  system,  direct  use  in  the  attitude  control  formulation,  has 
multiple  possibilities  as  well.  Attitude  control  systems  normally  include  proportional  and  integral  control  on  sensed 
attitude,  with  damping  and  robustness  provided  through  rate  feedback.  A  disturbance  force  acting  on  the  body  must 
produce  body  angular  rate  before  the  controller  moves  the  control  effectors  to  cancel  it.  If  some  direct  measure  of 
angular  acceleration  could  be  sensed,  the  attitude  controller  could  potentially  obtain  a  more  optimal  tracking  response. 
In  addition,  if  torque  was  estimated  from  the  strains  sensed,  with  knowledge  of  the  inertia,  a  low  latency  measure  of 
angular acceleration could be obtained. With this estimate, high-latency optical angular rate feedback, or phase error,
might  be  directly  mitigated  to  a  first  order. 


II.  Model  and  Simulation  Description 


Figure  2.  Attitude  control  demonstrations  were  based  on  a  model  of  the  AFRL  GenMAV  vehicle.  Aerodynamic  coefficients  were  calculated 
using  AVL.  Models  assumed  a  configuration  with  ailerons,  not  shown  in  the  hardware  depicted. 


The  specific  mechanisms  by  which  load  sensing  is  used  in  natural  systems  for  flight  stabilization  are  not  known.  To 
demonstrate  potential  applications  and  the  associated  benefits,  a  model  of  the  Air  Force  Research  Laboratory  (AFRL) 
developed Generic Micro-Air-Vehicle (GenMAV) was employed, as shown in Figure 2. This choice avoided the
complexity  of  modeling  a  flapping  wing  system,  allowing  for  more  straightforward  conceptualization  of  engineering 
applications,  while  still  providing  direct  insight  into  potential  biological  mechanisms. 

GenMAV is a conventionally shaped air vehicle with a high-wing configuration, a wingspan of 24" and a chord
of 5". It has a conventional tail with a horizontal surface of 12" and a vertical surface of 4.6". The fuselage is 16.5"
in length and approximately 3" in diameter at its widest point. GenMAV is a bank-to-turn vehicle controlled by a pair
of elevons that make up 50% of the chord on the horizontal stabilizer. Its body and wings are composed of carbon
fiber with enough layers to ensure adequate rigidity. For this investigation, the GenMAV is modeled with conventional
ailerons, elevator, and rudder, a different control configuration from the actual hardware design. GenMAV was developed
as a reference vehicle for research conducted within and outside of AFRL. 17 The generic design is based on
several  iterations  of  MAVs  previously  studied  in  AFRL  and  provides  a  convenient  baseline  from  which  various  MAV 
technologies  can  be  explored. 18 

Control system modeling of the flight vehicle was accomplished in the Matlab Simulink™ environment. The 6DOF
simulation environment was constructed using a direct implementation of the quaternion dynamics model documented
by Phillips. 19 20 To provide aerodynamic disturbances, the continuous Dryden turbulence model within the Aerospace
Blockset was used, with the wind speed parameter set to approximately 10 percent of the MAV ground speed. Characterization
of the GenMAV vehicle, in order to provide an aerodynamic truth model, was accomplished with the
Athena Vortex Lattice (AVL) code. AVL was developed by Harold Youngren of MIT, and subsequently by Mark Drela
(also of MIT) to provide aerodynamic and flight-dynamic analysis of rigid aircraft with arbitrary configurations. 21
The  program  applies  thin  airfoil  theory  to  predict  the  inviscid  aerodynamic  forces  and  moments  acting  on  the  lifting 
surface  of  an  air  vehicle.  Thin  airfoil  theory  approximates  the  airfoil  as  a  combination  of  uniform  flow  and  a  vortex 
sheet  placed  along  the  camber  line.  This  leads  to  the  aerodynamic  force  and  moment  being  primarily  a  function  of 
angle  of  attack  and  camber  line  geometry.  Based  on  the  assumptions  behind  thin  airfoil  theory,  AVL  is  best  suited  for 
applications  involving  thin  lifting  surfaces,  i.e.,  maximum  thickness  of  12%  chord  or  less,  at  small  angles  of  attack  and 
sideslip.  In  AVL,  the  lifting  surfaces  of  an  aircraft  are  modeled  as  single-layer  vortex  sheets  discretized  into  horseshoe 
vortex filaments. Flow is assumed to be quasi-steady and within the limit pertaining to small reduced frequency. This
translates into limits on each of the dimensionless rate parameters, $pb/2V$, $qc/2V$, and $rb/2V$.
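
A minimal sketch of how the dimensionless rates might be checked against quasi-steady limits follows. The rate definitions use the body rates [p, q, r], the reference lengths b and c, and the velocity V from the nomenclature. The numeric bounds are those quoted in the AVL user documentation and are an assumption here, not values taken from this paper; the function names, airspeed, and body rates are hypothetical.

```python
def dimensionless_rates(p, q, r, b, c, v):
    """Return (pb/2V, qc/2V, rb/2V); rates in rad/s, lengths in m, speed in m/s."""
    return (p * b / (2.0 * v), q * c / (2.0 * v), r * b / (2.0 * v))

# Assumed bounds on |pb/2V|, |qc/2V|, |rb/2V| (from the AVL user guide,
# not from this paper).
AVL_LIMITS = (0.10, 0.03, 0.25)

def within_quasi_steady_limits(p, q, r, b, c, v, limits=AVL_LIMITS):
    """True when every dimensionless rate magnitude is inside its bound."""
    rates = dimensionless_rates(p, q, r, b, c, v)
    return all(abs(rate) < limit for rate, limit in zip(rates, limits))

# GenMAV reference lengths (24 in span, 5 in chord); airspeed is assumed.
span = 24 * 0.0254    # m
chord = 5 * 0.0254    # m
airspeed = 15.0       # m/s, hypothetical

print(dimensionless_rates(1.0, 0.5, 0.5, span, chord, airspeed))
print(within_quasi_steady_limits(1.0, 0.5, 0.5, span, chord, airspeed))
```

Because the reference lengths of a MAV are small, fairly large body rates are needed before these dimensionless parameters approach the assumed bounds.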

Given that thin airfoil theory deals with 2-D potential flow, drag due to viscous effects is not calculated in AVL,
and  the  lift  coefficient  is  a  linear  function  of  angle  of  attack.  Overall  drag  is  represented  as  a  combination  of  lift- 
induced  drag  plus  an  approximation  for  parasitic  effects.  In  addition  to  static  coefficients,  AVL  provides  damping 
coefficients,  including  the  coupled  terms  between  roll  and  yaw,  and  control  surface  derivatives.  The  full  complement 
of  aerodynamic  coefficients  was  used  for  this  work. 

For  demonstration  of  the  benefits  of  torque  feedback,  the  attitude  control  was  implemented  using  three  independent 
PD  controllers;  body  rate  and  attitude  error  were  assumed  to  be  optically  observable.  Pitch  angle  was  used  directly 
to  control  altitude  error.  The  outer  altitude  control  loop  was  in  the  form  of  PID  control,  allowing  the  integral  term 
to  account  for  gravity  bias.  The  attitude  loops  were  tuned  to  respond  as  critically  damped  second  order  systems  with 
nominally a 5 Hz natural frequency. Attitude control output was in the form of a torque command for each body axis,
i.e.,

$$
\begin{aligned}
\text{Commanded Torque} &= J\ddot{\theta} \\
&= -J\omega_n^2(\theta_m - \theta_{com}) - J\,2\zeta\omega_n\dot{\theta}_m \\
&= -K_p(\theta_m - \theta_{com}) - K_d\dot{\theta}_m,
\end{aligned}
\tag{1}
$$

where $K_p$ and $K_d$ are the proportional and derivative control gains, and $\theta$ is an angular degree of freedom with an associated
inertia $J$. The desired damping ratio and natural frequency are represented by $\zeta$ and $\omega_n$, respectively.
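
As a worked illustration of equation (1), the sketch below computes the PD gains for a critically damped loop with a 5 Hz natural frequency. The inertia value and the function names are hypothetical, chosen only to make the example runnable; the paper does not list the GenMAV inertias in this section.

```python
import math

def pd_gains(inertia, natural_freq_hz, damping_ratio):
    """Gains implied by equation (1): Kp = J*wn^2, Kd = 2*zeta*wn*J."""
    omega_n = 2.0 * math.pi * natural_freq_hz
    kp = inertia * omega_n ** 2
    kd = 2.0 * damping_ratio * omega_n * inertia
    return kp, kd

def commanded_torque(kp, kd, theta_m, theta_com, theta_dot_m):
    """Torque command for one body axis from measured angle and rate."""
    return -kp * (theta_m - theta_com) - kd * theta_dot_m

J = 1.0e-3   # kg m^2, hypothetical inertia for illustration only
kp, kd = pd_gains(J, natural_freq_hz=5.0, damping_ratio=1.0)
print(f"Kp = {kp:.4f} N m/rad, Kd = {kd:.5f} N m s/rad")
print(commanded_torque(kp, kd, theta_m=0.1, theta_com=0.0, theta_dot_m=0.0))
```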

Figure 3. Actuator response models used for load feedback 6DOF demonstrations: a) first order response model, b) second order torque
error regulator. The nominal response time constants for both models were defined to be 17 ms as shown in c). In a) and b), $\delta$ represents
the achieved control surface deflection and $F_{T/a}$ represents a nominal scale factor to convert torque to control surface deflection angle.


A  nominal  first  order  control  surface  actuator  response  model  was  used  for  baseline  comparison.  Alternatively, 
closed-loop  actuators,  which  used  measured  body  torque  error  to  drive  the  ailerons,  elevator,  and  rudder,  were  modeled 
as  second  order  damped  torque  motors  with  angular  limits  at  +/-  30  degrees.  The  natural  frequency  of  the  second 
order  actuator  control  loop  was  nominally  20  Hz,  having  a  17  ms  time  constant  for  63%  response.  The  first  order 
actuator was also defined with a 17 ms time constant for consistency. Both actuator response models used a common
scale factor, $F_{T/a}$, derived around straight and level flight conditions to convert torque to angle (Figure 3). Under
nominal conditions, to the degree that the second and first order responses were similar, the airframe response would be
expected to be similar. Under conditions of degraded control realization and non-zero latency, significant differences
would be expected due to the inability of the controller without torque feedback actuation to reflexively respond to
errors and disturbances.
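
The relationship between the 20 Hz natural frequency and the 17 ms response time can be checked with a short calculation. The sketch below assumes the second order actuator loop is critically damped, which the text does not state explicitly (it says only "damped"), and ignores the angular limits; the function names are hypothetical.

```python
import math

def first_order_step(t, tau):
    """Unit step response of a first order lag with time constant tau."""
    return 1.0 - math.exp(-t / tau)

def critically_damped_step(t, omega_n):
    """Unit step response of a critically damped second order system."""
    x = omega_n * t
    return 1.0 - (1.0 + x) * math.exp(-x)

def time_to_fraction(step, fraction, t_hi=1.0, tol=1e-9):
    """Bisection for the time at which a monotonic step response reaches `fraction`."""
    t_lo = 0.0
    while t_hi - t_lo > tol:
        t_mid = 0.5 * (t_lo + t_hi)
        if step(t_mid) < fraction:
            t_lo = t_mid
        else:
            t_hi = t_mid
    return 0.5 * (t_lo + t_hi)

omega_n = 2.0 * math.pi * 20.0          # 20 Hz actuator loop
target = 1.0 - math.exp(-1.0)           # the "63% response" level

t63_second = time_to_fraction(lambda t: critically_damped_step(t, omega_n), target)
t63_first = time_to_fraction(lambda t: first_order_step(t, 0.017), target)
print(f"second order: {1e3 * t63_second:.2f} ms, first order: {1e3 * t63_first:.2f} ms")
```

Under the critical damping assumption, the second order loop reaches the 63% level in close to 17 ms, which is consistent with the matching time constant chosen for the first order model.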

To demonstrate the potential benefit from augmenting the attitude controller with wing load feedback, a formulation
was derived to use the assumed low-latency measurement of body torque to compensate for the destabilizing
influence of high latency in the optical feedback pathway. To derive the control expression used, both inertia and
feedback latency are assumed to be known to some approximation. The control gains associated with measured angle,
$\theta_m$, angular rate, $\dot{\theta}_m$, and torque, $T_m$, are required:

$$
\text{Commanded Torque} = -\tilde{K}_p(\theta_m - \theta_{com}) - \tilde{K}_d\dot{\theta}_m - \tilde{K}_t T_m .
$$

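These new gains are found in terms of the gains in (1), the latency, $\Delta t_{opt}$, and the inertia, $J$, using simple Taylor
series approximations:

$$
\dot{\theta}_{est} = \dot{\theta}_m + \frac{T_m}{J}\Delta t_{opt} \tag{2}
$$

$$
\begin{aligned}
\theta_{est} &= \theta_m + \left(\dot{\theta}_m + \frac{T_m}{J}\Delta t_{opt}\right)\Delta t_{opt} + \frac{T_m}{2J}\Delta t_{opt}^2 \\
&= \theta_m + \dot{\theta}_m\Delta t_{opt} + \frac{3T_m}{2J}\Delta t_{opt}^2 .
\end{aligned}
\tag{3}
$$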

Substituting  the  estimates  represented  by  (2)  and  (3)  for  the  measured  quantities  in  (1),  then  collecting  terms, 
provides  the  following  expressions  for  the  new  gains: 


$$
\tilde{K}_p = K_p \tag{4}
$$

$$
\tilde{K}_d = K_p\Delta t_{opt} + K_d \tag{5}
$$

$$
\tilde{K}_t = K_p\frac{3\Delta t_{opt}^2}{2J} + K_d\frac{\Delta t_{opt}}{J} \tag{6}
$$
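
The gain expressions (4)-(6) can be checked numerically against the substitution they come from. The sketch below is an illustration under assumed values: the inertia, latency, gains, and measured states are hypothetical, and the function names are not from the paper. It evaluates the commanded torque once by inserting the estimates (2) and (3) into (1), and once with the collected gains, and compares the two.

```python
def pdt_gains(kp, kd, inertia, dt_opt):
    """Collected gains of equations (4)-(6)."""
    kp_new = kp
    kd_new = kp * dt_opt + kd
    kt_new = kp * 3.0 * dt_opt ** 2 / (2.0 * inertia) + kd * dt_opt / inertia
    return kp_new, kd_new, kt_new

def torque_from_estimates(kp, kd, inertia, dt_opt, theta_m, theta_com, rate_m, torque_m):
    """Equation (1) evaluated with the estimates of equations (2) and (3)."""
    rate_est = rate_m + torque_m / inertia * dt_opt
    theta_est = theta_m + rate_m * dt_opt + 3.0 * torque_m / (2.0 * inertia) * dt_opt ** 2
    return -kp * (theta_est - theta_com) - kd * rate_est

def torque_from_pdt(kp, kd, inertia, dt_opt, theta_m, theta_com, rate_m, torque_m):
    """PDT control law written directly in the measured states."""
    kp_new, kd_new, kt_new = pdt_gains(kp, kd, inertia, dt_opt)
    return -kp_new * (theta_m - theta_com) - kd_new * rate_m - kt_new * torque_m

# Hypothetical values for illustration only.
args = dict(kp=0.98696, kd=0.0628319, inertia=1.0e-3, dt_opt=0.030,
            theta_m=0.1, theta_com=0.0, rate_m=0.5, torque_m=0.002)
a = torque_from_estimates(**args)
b = torque_from_pdt(**args)
print(a, b)
```

The two values agree to rounding error, which confirms that the sign of each term in (4)-(6) is consistent with (1)-(3).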


Efforts  to  develop  an  optimal  attitude  control  law  using  body  force  states  to  minimize  a  cost  function,  as  in  LQR, 
were  not  complete  at  the  time  of  this  publication. 


III.  Results 

Two  test  cases  were  devised  to  demonstrate  the  benefit  of  load  sensing  on  dynamic  performance.  Both  test  cases 
included  the  turbulence  model  previously  described  and  an  initial  one  meter  step  command  in  altitude.  The  first  test 
case  demonstrated  performance  with  and  without  degraded  control  response.  To  model  the  degraded  control  response, 
the  angular  response  of  all  control  surfaces  was  cut  in  half,  thereby  modeling  a  50%  degradation  in  control  surface 
effectiveness.  Linear  forces  through  the  center  of  gravity  were  not  degraded. 

Figures 4a and 4b show the roll and yaw attitude in the presence of turbulence and a one meter altitude step command
with no torque feedback. The two time histories represent cases with and without control system degradation.
Figures  4c  and  4d  show  the  same  two  cases  with  torque  feedback  to  control  actuator  position.  The  cases  with  torque 
feedback  to  the  actuator  show  roughly  an  order  of  magnitude  better  disturbance  rejection.  Both  with  and  without  torque 
feedback,  an  increase  in  attitude  response  is  seen  as  a  result  of  the  control  surface  degradation.  Figure  5  demonstrates 
the  increased  response  of  the  rudder  and  aileron  in  the  presence  of  control  degradation  without  torque  feedback,  a)  and 
b),  and  with  torque  feedback,  c)  and  d).  The  control  surface  positions  are  similar  in  character  with  increased  amplitude 
for  the  torque  feedback  control.  As  demonstrated  by  Figure  4,  the  closed-loop  actuator  more  effectively  dealt  with  the 
deviations  from  the  commanded  body  torques. 

The  second  test  case  involves  introduction  of  latency  into  the  state  feedback  that  drives  the  attitude  control  law.  In 
animal  systems,  in  particular  those  that  do  not  have  highly  dedicated  rate  sensing  physiology,  optical  flow  provides  a 
primary means for sensing angular motion. In insects, the neuronal processing of visual motion may introduce 30 ms
or  more  latency  into  the  feedback  process,  depending  on  species  and  ambient  light  level.  This  delay  would  be  expected 
to  have  a  detrimental  impact  on  the  attitude  control  system.  The  strain  mechanosensors  typically  have  a  much  more 
direct  pathway  to  the  muscles  that  they  stimulate.  Campaniform  sensilla,  the  load  sensors  on  insects,  can  induce  a 
response  in  the  muscles  in  an  order  of  magnitude  less  time  than  that  achieved  by  the  vision  system.  The  haltere  to 
motor neuron pathway in dipteran insects is a well characterized example of this type of quick reflexive response. For
this  test  case,  a  latency  of  3  ms  on  the  torque  feedback  and  10  ms  on  the  optical  feedback  was  sufficient  to  demonstrate 
the  benefit  of  closed-loop  torque  regulation. 
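
To give a rough sense of scale for these latencies, the sketch below computes the delay margin of an idealized single-axis PD loop tuned as in equation (1). It is a simplification and not the 6DOF simulation used in this paper: it assumes a pure inertia with no aerodynamic damping, an ideal actuator, and the same delay on angle and rate feedback. The function name is hypothetical.

```python
import math

def pd_delay_margin(natural_freq_hz, damping_ratio):
    """Delay margin (s) of a PD loop around a pure inertia, J * theta_ddot = torque.

    Loop gain: L(s) = (wn^2 + 2*zeta*wn*s) * exp(-s*delay) / s^2.
    The inertia cancels because Kp = J*wn^2 and Kd = 2*zeta*wn*J.
    """
    wn = 2.0 * math.pi * natural_freq_hz
    z = damping_ratio
    # Gain crossover: |L(j*w)| = 1 gives x^4 - 4*z^2*x^2 - 1 = 0 with x = w / wn.
    x = math.sqrt(2.0 * z * z + math.sqrt(4.0 * z ** 4 + 1.0))
    # Phase margin without delay: the PD zero adds atan(2*z*x) to the -180 deg
    # contributed by the double integrator.
    phase_margin = math.atan(2.0 * z * x)
    # The delay removes w*delay radians of phase at crossover.
    return phase_margin / (x * wn)

margin = pd_delay_margin(natural_freq_hz=5.0, damping_ratio=1.0)
print(f"delay margin = {1e3 * margin:.1f} ms")
```

For the 5 Hz, critically damped tuning this idealized margin is roughly 20 ms, between the 10 ms optical latency used in this test case and the 30 ms quoted for insect visual processing. Actuator lag, which the sketch ignores, would reduce the margin further.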

Figure 4. Euler angle response of GenMAV without and with torque feedback to actuators for the case of 50% degradation in control
capability. a) Yaw Euler angle with open-loop rudder control, b) Roll Euler angle with open-loop aileron control, c) Yaw Euler angle with
torque feedback to rudder control, d) Roll Euler angle with torque feedback to aileron control. Horizontal axis: Time (sec).

Figure 5. Rudder and aileron control surface response for the case of 50% degradation in control capability (Figure 3). a) Open-loop
rudder response and b) open-loop aileron response, c) and d) show response of the rudder and aileron with closed-loop actuator control.

Figures 6a and 6b demonstrate that, without torque feedback, a highly oscillatory yaw and roll attitude response
results in the presence of 10 ms of latency. This response is stimulated by turbulent disturbances. Note that the
bandwidths of the attitude loops were tuned for the ideal zero latency case. In contrast, Figures 6c and 6d show the
same comparison with the torque feedback to the actuators. Figure 7 shows the corresponding control surface angles
for  the  two  cases  from  Figures  6c  and  6d.  The  two  curves  in  this  figure  depict  a  very  similar  response.  The  fact  that 
actuator  response  does  not  change  significantly  in  the  presence  of  the  optical  feedback  latency  indicates  that  the  torque 
feedback  loop  is  primarily  responsible  for  mitigating  the  effect  of  the  turbulence  induced  torque  disturbances.  The 
improved  disturbance  rejection  delays  the  onset  of  system  oscillation  as  latency  increases. 
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Figure  6.  Euler  angle  response  of  GenMAV  without  and  with  torque  feedback  to  the  actuators  for  the  case  of  3  ms  torque  feedback  latency 
and  10  ms  attitude  state  feedback  latency,  a)  Yaw  Euler  angle  with  open-loop  rudder  control,  b)  Roll  Euler  angle  with  open-loop  aileron 
control,  c)  Yaw  Euler  angle  with  torque  feedback  to  rudder  control,  d)  Roll  Euler  angle  with  torque  feedback  to  aileron  control. 


[Figure 7 plot panels omitted; recoverable labels: panel titles "Rudder Angle with Torque Feedback" and "Aileron Angle with Torque Feedback", horizontal axis "Time (sec)".]

Figure 7. Rudder and aileron control surface response for the case of 3 ms torque feedback latency and 10 ms attitude state feedback
latency (Figures 6c and 6d). a) Open-loop control and b) torque regulated control.

As latency increases further, even with torque feedback to the actuator, the system begins to destabilize, as shown
in Figure 8a. In this example, 30 ms of latency was simulated. Figure 8b shows the response with the PD attitude
controller augmented with torque to compensate for the latency (PDT control); see equations (4)-(6). In this case,
the system responds with some pitch oscillation in response to the one meter step in altitude, but quickly stabilizes,
closely matching the zero-latency baseline case. Note that the motion that dominates in all cases described is pitch
motion. There is no mechanism in the defined control scheme to respond to measured z-force error except through
pitch control. The resulting motion to effect a decrease in altitude error is much larger than the residual motion in
yaw and roll. In a flapping wing design, where z-force could be controlled independently, this coupling of degrees of
freedom would not necessarily be required.


[Figure 8 plot panels omitted; recoverable labels: panel titles a) "PD Attitude Control" and b) "PD-T Attitude Control", horizontal axes "Time (sec)".]

Figure 8. Pitch response for the case of 30 ms optical latency and closed-loop actuator control. a) Highly oscillatory
response without latency compensation. b) Improved response with torque-augmented attitude control to
compensate for the latency.


IV.  Discussion 

The  results  in  this  paper  demonstrate  the  ability  of  wing  load  sensors  to  improve  attitude  control  robustness  and 
disturbance  rejection  in  the  presence  of  significant  uncertainty  in  plant  characteristics.  By  directly  measuring  the 
torque  around  a  given  axis,  the  error  with  respect  to  the  torque  commanded  by  the  attitude  control  law  can  be  reduced 
without  dependence  on  known  aerodynamic  characteristics  of  the  airframe.  Through  the  same  mechanism,  torque 
disturbances  on  the  airframe  can  be  dealt  with  through  high  speed  reflexive  control,  leaving  the  outer  attitude  control 
loop  to  deal  with  lower  frequency  optical  tracking.  This  wing  load  feedback  mechanism  may  explain  the  robustness 
of  insect  flight,  where  significant  damage  to  wings  is  tolerated  and  high  variations  in  control  performance  occur  due 
to  such  factors  as  temperature,  age,  individual  variation,  and  metabolic  state. 
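
The claim that torque feedback removes the dependence on known aerodynamic characteristics can be illustrated with a toy inner loop. The sketch below is not the GenMAV model used in the paper: it assumes a hypothetical static plant, torque = effectiveness * deflection + disturbance, and an integrating actuator whose rate is proportional to the torque error. The function name, gain, and time step are illustrative assumptions.

```python
def simulate_torque_loop(effectiveness, torque_cmd=1.0, gain=20.0,
                         disturbance=0.0, dt=0.001, steps=5000):
    """Drive measured torque toward torque_cmd with an integrating actuator.

    Hypothetical plant: torque = effectiveness * deflection + disturbance.
    The controller never uses `effectiveness`; it only sees the torque error.
    """
    deflection = 0.0
    for _ in range(steps):
        torque = effectiveness * deflection + disturbance  # "measured" torque
        deflection += gain * (torque_cmd - torque) * dt    # rate follows the error
    torque = effectiveness * deflection + disturbance
    return deflection, torque


for label, kwargs in [
    ("nominal", dict(effectiveness=1.0)),
    ("50% degraded", dict(effectiveness=0.5)),
    ("degraded + bias", dict(effectiveness=0.5, disturbance=0.3)),
]:
    deflection, torque = simulate_torque_loop(**kwargs)
    print(f"{label:16s} deflection={deflection:.3f} torque={torque:.3f}")
```

In this toy setting the achieved torque settles at the commanded value in all three cases; only the deflection needed to get there changes, which is the behavior the paragraph above attributes to wing load feedback.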

The results shown in this paper rely on many assumptions with respect to sensor implementation. Actual implementation
of strain sensors may be a significant hurdle to realization of the results shown. Nature has evolved systems
that  rely  on  large  numbers  of  simple  sensors  spread  throughout  the  body  structure  to  sense  and  respond  to  interaction 
with  the  environment.  While  individual  sensors  may  be  very  poor  detectors  of  magnitude,  in  concert,  a  number  of 
simple  sensors  spread  throughout  a  structure  might  have  a  very  large  effective  dynamic  range.  The  human  engineering 
approach  is  to  build  more  elaborate  sensors  that  individually  achieve  the  required  dynamic  range.  Inherent  in  these 
approaches is a trade-off between integration complexity and robustness to damage. If one of the many simple sensors
associated with the insect wing is not functional, a small price is paid in terms of overall dynamic range, but the
system  still  functions.  If  the  single,  more  elaborate  sensor  on  the  man-made  system  is  damaged,  the  result  might  be 
catastrophic.  Significant  engineering  development  in  materials  and  manufacturing  technology  would  be  required  to 
duplicate  the  design  paradigm  that  is  prevalent  in  natural  systems.  However,  similar  performance  characteristics  might 
be  achievable  by  mimicking  a  low-latency,  load-based  control  mechanism  without  duplicating  the  sophistication  in 
materials  and  manufacturing  seen  in  nature. 

To  achieve  the  results  shown,  either  strain  sensors  would  have  to  be  placed  near  the  body  on  the  aero  surfaces, 
or  the  angular  acceleration  of  the  body  would  need  to  be  measured  directly.  With  knowledge  of  the  body  inertia 
characteristics,  the  net  torques  on  the  body  could  be  deduced  from  the  angular  accelerations.  The  feedback  to  the 
modeled  closed-loop  actuator  is  an  assumed  estimate  of  the  net  torque  on  the  body.  The  calculations  and  calibrations 
required  to  realize  actual  quantitative  estimates  of  torque  around  the  center  of  gravity  could  become  very  elaborate. 
Clearly,  nature  is  not  explicitly  relying  on  quantitative  estimates  of  torque  around  the  center  of  gravity.  Natural 
designs  take  maximum  advantage  of  symmetry  and  the  differential  effect  of  control  force  application  across  the  plane 
of  symmetry.  In  fact,  strain  sensors  on  the  left  half  of  the  body  may  influence  left  wing  control,  while  sensors  on 
the right half influence right wing control, without a significant contralateral influence, as in the halteres of flies. The
net effect of the forces would still result in a stabilizing influence. The primary requirement is a signal, related to
the  net  torque  on  the  body,  that  can  be  driven  to  zero  or  to  a  correctly  biased  state,  thereby  driving  the  torque  error 
to  zero.  This  should  be  achievable  with  symmetric  placement  of  strain  sensors  on  the  right  and  left  aero  surfaces. 
Compensation  for  residual  biases  can  be  realized  through  appropriate  application  of  controller  integral  terms  and  the 
aerodynamic  stability  inherent  in  the  basic  design. 
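
As a small illustration of deducing net torque from measured angular acceleration, the following sketch applies Euler's rigid-body equation, tau = I*omega_dot + omega x (I*omega), in principal body axes. The function name and the diagonal-inertia simplification are assumptions made for this example; the paper does not specify an estimator.

```python
def net_torque(inertia_diag, omega, omega_dot):
    """Net body torque from Euler's equation, assuming principal body axes.

    inertia_diag: (Ix, Iy, Iz) principal moments of inertia
    omega:        (p, q, r) body angular rates
    omega_dot:    (p_dot, q_dot, r_dot) body angular accelerations
    """
    ix, iy, iz = inertia_diag
    p, q, r = omega
    p_dot, q_dot, r_dot = omega_dot
    return (
        ix * p_dot + (iz - iy) * q * r,  # roll axis
        iy * q_dot + (ix - iz) * r * p,  # pitch axis
        iz * r_dot + (iy - ix) * p * q,  # yaw axis
    )


# With zero body rates, torque is simply inertia times angular acceleration.
print(net_torque((1.0, 2.0, 3.0), (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)))
```

The gyroscopic cross terms vanish at low body rates, which is one reason a purely proportional signal may be adequate for the reflexive loop discussed here.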

The  fundamental  idea  of  using  feedback  to  eliminate  uncertainty  in  the  output  of  an  actuator  is  not  new.  For 
example,  hydraulic  torque  motors  are  sometimes  implemented  with  a  pressure  loop  around  the  valve  to  reduce  the 
impact  of  hydraulic  resonance.  The  same  technique  is  more  generally  used  to  eliminate  nonlinearities  in  open-loop 
actuator  response.  Treating  the  entire  airframe  as  a  torque  motor  with  the  error  driving  the  control  surface  state  is  a 
deviation  from  conventional  attitude  control  techniques.  The  requirement  for  a  fail-safe  feature,  in  case  of  feedback 
interruption,  must  be  taken  into  consideration;  lacking  a  feedback  signal,  the  control  surfaces  will  be  driven  to  the 
limits  in  an  attempt  to  reduce  the  error. 
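
The fail-safe concern can be seen in a minimal sketch. It assumes a hypothetical integrating actuator with unit control effectiveness and a symmetric deflection limit; none of these values come from the paper.

```python
def run_integrating_actuator(feedback_available, torque_cmd=0.2, gain=20.0,
                             limit=0.5, dt=0.001, steps=3000):
    """Return the final surface deflection of a torque-error-driven actuator.

    Hypothetical plant: true torque equals the deflection (unit effectiveness).
    If the feedback signal is lost, the controller reads zero torque.
    """
    deflection = 0.0
    for _ in range(steps):
        true_torque = deflection
        measured = true_torque if feedback_available else 0.0
        deflection += gain * (torque_cmd - measured) * dt
        deflection = max(-limit, min(limit, deflection))  # mechanical stop
    return deflection


print("with feedback:   ", round(run_integrating_actuator(True), 3))
print("feedback missing:", round(run_integrating_actuator(False), 3))
```

With the signal present the surface settles at the deflection that produces the commanded torque; with the signal missing the error never closes and the surface rests on its stop, so a practical design would need to detect the loss and revert to an open-loop mode.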

The  techniques  described  in  this  paper  might  allow  for  MAV  designs  that  come  closer  to  the  performance  and 
robustness  of  natural  systems.  An  additional  objective  of  this  research  is  decreased  cost  and  complexity  of  MAV 
designs. Attaining this goal depends on replacing more costly or complex rate sensors. The described
control technique relies on the ability to implement the strain or load sensing transducers and the ability to obtain the
requisite outer loop rate and tracking estimates from multi-use optical sensors. This target configuration mimics
many insect sensor architectures. Nature is clearly able to achieve remarkable agility, behavioral complexity, and
robustness in extremely small packages through mechanisms that scientists and engineers are only beginning to understand.
The human tendency to attribute undue complexity to systems that we do not understand should be considered.
A concept as simple as reflexive response to wing load sensing could potentially explain much about the robustness
and performance capability seen in natural flying systems.

Abstract — This paper is motivated by a desire to develop analytical
formulations for cooperative defensive strategies against
predator(s). We formulate a single-pursuer, two-evader differential
game with a novel cost functional. Each of the three agents
is modeled as a massless particle that moves with constant
speed. The pursuer attempts to capture either of the evaders
while minimizing its cost. Simultaneously, the evaders strive
to maximize the pursuer's cost. The proposed cost functional
represents the increased cost to the pursuer when presented
with multiple, potentially dangerous targets. It captures the
effect of cooperation between the evaders. In order to solve the
game, we develop the optimality conditions for the equilibrium
strategies. We then integrate the resulting system of ordinary
differential equations backwards in time from the terminal
conditions to generate the optimal trajectories of the three-agent
system. The resulting trajectories display cooperative behaviors
between the two evaders, which are qualitatively similar to
behaviors found in predator-prey interactions in nature. A brief
description of singular surfaces is also included.

I.  Introduction 

The use of unmanned mobile systems is rapidly increasing
for a variety of reasons, including their relatively low
cost and their ability to operate in hazardous environments
with minimal risk to human life. Multiple cheap unmanned
systems, or agents, can be deployed simultaneously to accomplish
a task or mission. A very important application of
unmanned systems is in the modern battlefield to perform
tasks ranging from surveillance to direct engagement. In
these scenarios, the group of agents is often in direct
competition with an opposing force. It is therefore important
to find algorithms or strategies that can systematically
maximize the value offered by such groups of agents.

A  natural  setting  for  studying  such  issues  is  game  theory. 
In  this  paper,  we  introduce  a  single-pursuer,  two-evader  game 
with  a  novel  integral  cost  functional.  This  cost  functional  is 
intended  to  represent  the  risk  of  damage  or  injury  to  the 
pursuer  or  the  additional  energy  or  computational  expense 
needed  to  monitor  multiple  evaders.  During  the  game,  the 
pursuer  strives  to  minimize  this  cost  while  attempting  to 
capture  one  of  the  evaders.  Simultaneously,  the  evaders 
attempt  to  maximize  the  pursuer’s  cost  in  the  hopes  of 
making  pursuit  unattractive  from  certain  initial  conditions, 
thereby protecting themselves and their fellow evader. The
proposed cost functional is a combination of a constant time
penalty and an evader-generated cost. The evader-generated cost
component is based on the relative configuration of the three
agents and possesses particular characteristics that encourage
the evaders to attempt flanking maneuvers to surround the
pursuer. As a direct result of the evader-generated cost
component, the optimal evader strategies exhibit cooperative
defensive behaviors. It should be noted that cooperation is
not directly imposed as a requirement of the solution. Instead,
cooperation emerges as the optimal strategy.

*This material is based upon work supported under a National Science
Foundation Graduate Research Fellowship to Zachariah Fuchs.

**Pramod P. Khargonekar was supported by the Eckis Professor endowment
at the University of Florida.

The cooperative behaviors exhibited in the solution to
this game are qualitatively similar to numerous examples
of prey strategies used in response to attacking predators.
Some examples include red-winged blackbird nest defense [1],
meerkat predator mobbing [2], and predator identification in
guppy schools [3]. Such animal behaviors have been studied
extensively within the biological community, and theories
that explain their evolutionary stability and advantages have
been proposed [4]. Often, these theories utilize principles
from game theory. In particular, the concept of repeated
games is commonly deployed for this purpose [5], [6]. In
these approaches, the potential behaviors are represented
as strategies with assigned utilities that are inferred from
empirical data or based on the genetic similarity between
individuals. The different strategies are then shown to increase
the survivability or fitness of the genes that describe
these behaviors over time or multiple generations. Although
these approaches explain how cooperation is optimal in
the evolutionary sense, they do not directly address how
cooperation is beneficial at the day-to-day, system level. One
goal of this paper is to show how cooperation can arise as
the optimal strategy given particular system dynamics and a
particular cost functional.

Although biologically inspired, our main motivation for
the scenario presented in this paper and the resulting
cooperative defensive behaviors comes from the idea of
cooperative defense of high value assets. Just as in nature,
there are rarely any defenseless targets, and attacking forces
usually elect not to attack a target if the potential for injury
or high cost outweighs the benefit of the attack mission.
Thus, by cooperating to combine their defensive resources,
a group of evaders can make engagement more costly to the
attacker than if they acted independently. This increased cost
may then surpass a tolerance level for the potential attacker
and prevent an attack before it ever occurs. For example,
through cooperation a group of unmanned drones could be
used to protect vulnerable high-value targets, such as slow
moving cargo planes, supply ships, or a very important
person. If the high value target were attacked, the drones
could then engage in a cooperative defensive maneuver. This
cooperative defensive maneuver could be sufficient to protect
the intended asset.

Formally introduced by Isaacs [7], pursuit-evasion games
and their variants have been used to solve a wide range
of problems. Recently, the authors in [8] used the same
analysis techniques as this paper to examine a continuous
time, visibility based, single-pursuer, single-evader game in
an environment containing polygonal obstacles. There have
been several papers that focused on combat with realistic
dynamics [9], [10]. Pursuit-evasion games have also been
used to generate defensive strategies of a single evader.
In [11], the authors determine the optimal strategies for
electronic countermeasure use when initial conditions are
known. Because of the ability to optimize multiple and
sometimes conflicting value functions, game theory lends
itself to the analysis of cooperative systems. In [12], the
authors modify Isaacs' standard single-pursuer homicidal
chauffeur game by allowing multiple pursuers and propose
a daisy-chain formation that enables capture for a wider
range of parameters. A multi-evader pursuit-evasion game
was posed in [13], but the cost functional was based solely on
elapsed time. Because there was no direct cost generated by
the evaders, the resulting evader behaviors exhibit a scattered,
fleeing pattern instead of a cohesive, cooperative defensive
strategy as seen in our formulation. The situation in which
the evader can potentially capture or harm the pursuer is also
presented in [14] and [15].

In Section II, we describe the system under consideration.
We also develop an alternative coordinate system, which
will simplify later analysis. A novel evader-generated cost
function is then developed that captures the synergy between
the two evaders and serves as the primary motivation for
cooperation. Using the developed instantaneous cost, we then
describe the pursuit-evasion game under analysis. In Section
III, we develop the optimality conditions and perform the
necessary integration to generate the optimal agent trajectories.
Finally, in Section IV we summarize our findings and
describe future research directions.

II. System and Game Formulation

In this section, we describe the three-agent system
under analysis and define the kinematic equations that govern
the agents' motion. We also introduce a relative coordinate system
and corresponding kinematic equations, which will prove
to be more compact and intuitive for our analysis. After the
system kinematics are defined, we develop an integral cost
function based on the relative configuration of the
three agents. In the third subsection, we lay out the motivations
for a two-team differential game using the defined system
kinematics and pursuer cost function.


A.  Kinematics  of  Agents 

Consider a dynamic system with three agents: two evaders
and a pursuer. For brevity, we will often refer to the two
evaders as $E_1$ and $E_2$ and the pursuer as $P$. The three agents
are modeled as massless particles moving with simple motion
about an obstacle-free, infinite plane. Within this paper, two
different but equivalent coordinate systems are used. The
first coordinate system is referred to as the global coordinate
system and will be used to plot agent trajectories and other
visualizations. In this coordinate system, the position of
each agent is defined by its own pair of standard Cartesian
coordinates $(x, y)$. The velocities of $E_i$, $i \in \{1, 2\}$, and $P$ are
defined as $(v_i, \theta_i)$ and $(v_p, \psi)$ respectively. Here $v_i$ and $v_p$
represent the magnitudes of the velocities, and $\theta_i$ and $\psi$ represent
the headings. The heading angles are measured counter-clockwise
from the positive x-direction. The heading angle
is the control variable for each agent, and we assume $v_i$
and $v_p$ are constant. The state of the system is completely
defined by the 6-tuple $x_g = (x_1, y_1, x_2, y_2, x_p, y_p)$. The global
coordinate system is depicted graphically in Fig. 1a. The
global kinematic equations of the system are thus

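$$
\begin{aligned}
\dot{x}_p &= v_p \cos\psi, & \dot{y}_p &= v_p \sin\psi \\
\dot{x}_1 &= v_1 \cos\theta_1, & \dot{y}_1 &= v_1 \sin\theta_1 \\
\dot{x}_2 &= v_2 \cos\theta_2, & \dot{y}_2 &= v_2 \sin\theta_2
\end{aligned}
\tag{1}
$$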
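
The simple-motion model in (1) can be stepped forward with a forward-Euler update. The sketch below is illustrative only; the function name and the tuple layouts are assumptions chosen to mirror the 6-tuple state defined above.

```python
import math


def step_global(state, headings, speeds, dt):
    """Advance the global state one Euler step under simple motion.

    state:    (x1, y1, x2, y2, xp, yp)
    headings: (theta1, theta2, psi), measured CCW from the +x direction
    speeds:   (v1, v2, vp), held constant
    """
    x1, y1, x2, y2, xp, yp = state
    theta1, theta2, psi = headings
    v1, v2, vp = speeds
    return (
        x1 + v1 * math.cos(theta1) * dt, y1 + v1 * math.sin(theta1) * dt,
        x2 + v2 * math.cos(theta2) * dt, y2 + v2 * math.sin(theta2) * dt,
        xp + vp * math.cos(psi) * dt,    yp + vp * math.sin(psi) * dt,
    )


state = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)
state = step_global(state, (math.pi / 2, math.pi, 0.0), (1.0, 1.0, 2.0), 0.5)
print(state)
```

Because the headings are the only controls and the speeds are fixed, each agent's instantaneous velocity is fully determined by one angle.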

We will now introduce a second coordinate system, which
represents the locations of each of the evaders relative
to the position of the pursuer. This representation will allow
us to reduce the number of dimensions in later analysis and
will be referred to as the relative coordinate system. In this
coordinate system, the state of the system is represented
by the following 6-tuple, $x_r = (d_1, d_2, \alpha, \beta, x, y)$. The first
two coordinates, $d_1$ and $d_2$, represent the distance between
$E_1$ and $P$ and the distance between $E_2$ and $P$ respectively.
The angle $\alpha$ is measured counter-clockwise from $PE_1$ to
$PE_2$. The angle $\beta$ represents the global rotation of the
three-agent system and is measured counter-clockwise from
the positive x-direction to $PE_1$. The $x$ and $y$ coordinates
represent the global position of the pursuer. The six coordinates
can be separated into two groups. The first group, $(d_1, d_2, \alpha)$,
contains all necessary information to describe the relative
configuration of the three agents. The second group, $(\beta, x, y)$,
contains the global rotational and translational information. In
the relative coordinate system, the evader heading angle, $\theta_i$,
is measured counter-clockwise from $PE_i$ in order to simplify
the kinematic equations. Similarly, the pursuer heading angle,
$\psi$, is measured counter-clockwise from $PE_1$. The relative
coordinate system is graphically depicted in Fig. 1b.

The  global  and  relative  representations  are  related  through 
the  following  equations. 

$$x_p = x, \qquad y_p = y \tag{2}$$

$$x_1 = d_1\cos(\beta) + x, \qquad y_1 = d_1\sin(\beta) + y \tag{3}$$

$$x_2 = d_2\cos(\beta + \alpha) + x, \qquad y_2 = d_2\sin(\beta + \alpha) + y \tag{4}$$

Fig. 1. Coordinate Systems


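The control variables are related as follows, where a superscript $g$ marks a
heading angle measured in the global coordinate system.

$$\theta_1^g = \theta_1 + \beta, \qquad \theta_2^g = \theta_2 + \beta + \alpha, \qquad \psi^g = \psi + \beta \tag{5}$$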
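
The mapping between the two representations can be sketched in a few lines. The forward map follows (2)-(4); the inverse map and the function names are assumptions added for illustration (the paper gives only the forward relations), and the inverse wraps the separation angle into [0, 2*pi).

```python
import math


def relative_to_global(xr):
    """Map (d1, d2, alpha, beta, x, y) to (x1, y1, x2, y2, xp, yp)."""
    d1, d2, alpha, beta, x, y = xr
    return (
        d1 * math.cos(beta) + x, d1 * math.sin(beta) + y,
        d2 * math.cos(beta + alpha) + x, d2 * math.sin(beta + alpha) + y,
        x, y,
    )


def global_to_relative(xg):
    """Inverse map (an assumption of this sketch, not stated in the paper)."""
    x1, y1, x2, y2, xp, yp = xg
    d1 = math.hypot(x1 - xp, y1 - yp)
    d2 = math.hypot(x2 - xp, y2 - yp)
    beta = math.atan2(y1 - yp, x1 - xp)
    alpha = (math.atan2(y2 - yp, x2 - xp) - beta) % (2.0 * math.pi)
    return (d1, d2, alpha, beta, xp, yp)


def headings_to_global(theta1, theta2, psi, alpha, beta):
    """Convert relative heading angles to global headings."""
    return (theta1 + beta, theta2 + beta + alpha, psi + beta)


xr = (2.0, 3.0, 1.0, 0.5, -1.0, 4.0)
print(global_to_relative(relative_to_global(xr)))
```

A round trip through both maps recovers the original relative state up to floating-point error, which is a quick sanity check on the sign and angle conventions.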

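Using the variables in the reduced model with the dynamics
in (2), the reduced space kinematic equations are shown
below.

$$\dot{d}_1 = v_1\cos\theta_1 - v_p\cos\psi \tag{6}$$

$$\dot{d}_2 = v_2\cos\theta_2 - v_p\cos(\psi - \alpha) \tag{7}$$

$$\dot{\alpha} = \frac{1}{d_2}\left(v_2\sin\theta_2 - v_p\sin(\psi - \alpha)\right) - \frac{1}{d_1}\left(v_1\sin\theta_1 - v_p\sin\psi\right) \tag{8}$$

$$\dot{\beta} = \frac{1}{d_1}\left(v_1\sin\theta_1 - v_p\sin\psi\right) \tag{9}$$

$$\dot{x} = v_p\cos(\psi + \beta) \tag{10}$$

$$\dot{y} = v_p\sin(\psi + \beta) \tag{11}$$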
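
One way to check the relative kinematics is to compare them against a finite-difference derivative of the geometry itself. The sketch below assumes the rates of the separations and angles take the form implied by the coordinate definitions above (radial and tangential components of each evader's velocity relative to the pursuer); function names and numerical values are illustrative.

```python
import math


def relative_rates(d1, d2, alpha, theta1, theta2, psi, v1, v2, vp):
    """Analytic rates (d1_dot, d2_dot, alpha_dot, beta_dot) in relative coordinates."""
    d1_dot = v1 * math.cos(theta1) - vp * math.cos(psi)
    d2_dot = v2 * math.cos(theta2) - vp * math.cos(psi - alpha)
    beta_dot = (v1 * math.sin(theta1) - vp * math.sin(psi)) / d1
    alpha_dot = (v2 * math.sin(theta2) - vp * math.sin(psi - alpha)) / d2 - beta_dot
    return d1_dot, d2_dot, alpha_dot, beta_dot


def finite_difference_rates(d1, d2, alpha, beta, theta1, theta2, psi,
                            v1, v2, vp, h=1e-6):
    """Central-difference rates from straight-line motion in the global frame."""
    def config(t):
        # Pursuer starts at the origin; headings are converted to the global frame.
        px = vp * math.cos(psi + beta) * t
        py = vp * math.sin(psi + beta) * t
        e1x = d1 * math.cos(beta) + v1 * math.cos(theta1 + beta) * t
        e1y = d1 * math.sin(beta) + v1 * math.sin(theta1 + beta) * t
        e2x = d2 * math.cos(beta + alpha) + v2 * math.cos(theta2 + beta + alpha) * t
        e2y = d2 * math.sin(beta + alpha) + v2 * math.sin(theta2 + beta + alpha) * t
        b = math.atan2(e1y - py, e1x - px)
        a = math.atan2(e2y - py, e2x - px) - b
        return (math.hypot(e1x - px, e1y - py),
                math.hypot(e2x - px, e2y - py), a, b)

    plus, minus = config(h), config(-h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))


args = dict(d1=2.0, d2=3.0, alpha=1.1, theta1=0.4, theta2=-0.7, psi=0.2,
            v1=1.0, v2=1.0, vp=2.0)
print(relative_rates(**args))
print(finite_difference_rates(beta=0.3, **args))
```

The two printed tuples should agree to several decimal places for configurations away from the atan2 branch cut, and the rates of the first three coordinates do not depend on the value chosen for the global rotation.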

We further condition these equations with the following
two restrictions.

$$d_1 \ge d_c \quad \text{and} \quad d_2 \ge d_c \tag{12}$$

$$v_1 < v_p \quad \text{and} \quad v_2 < v_p \tag{13}$$

The first restriction, (12), requires that both distances are
greater than or equal to the capture distance, $d_c$. The second
restriction, (13), requires that the pursuer is faster than both
of the evaders, which ensures that the pursuer is capable of
capturing an evader in finite time.

B.  Instantaneous  Cost  Function 

In  this  section,  we  develop  an  instantaneous  cost  function 
dependent  on  the  relative  positions  of  the  two  evaders  and 
pursuer.  The  developed  cost  function  captures  the  synergy 
between  the  evaders  and  serves  as  the  primary  incentive  for 
cooperation  within  the  evading  team.  With  respect  to  our 
biological  inspiration,  this  cost  could  model  the  risk  of  injury 
to  a  predator  caused  by  the  prey.  In  terms  of  a  man-made 
example,  the  evader-generated  cost  could  represent  the  risk 
of  damage  to  an  attacking  aircraft  from  the  targets’  defensive 
capabilities. 


Each evader generates an individual cost, which is a
function of distance between the evader and pursuer. In this
paper, exponential cost functions are used for $E_1$ and $E_2$:

$$C_1(d_1) = k_1 e^{k_2(d_c - d_1)}, \qquad C_2(d_2) = k_1 e^{k_2(d_c - d_2)} \tag{14}$$

where the constant $k_1$ defines the maximum value of the cost
and $k_2$ controls how quickly the cost decays as a function
of distance. These functions were chosen because of their
simplicity, but more complex functions could be used to
model particular predator-prey or attacker-target interactions.

We provide the pursuer the ability to counteract or reduce
these individual costs. Returning to our aircraft attack example,
the aircraft may be able to perform evasive maneuvers or
deploy countermeasures if a threat is detected. The detection
of the threat may be relatively straightforward if only a single
target exists, but in the case of multiple targets, it may be
necessary to allocate finite sensory or processing capabilities
between multiple threats. The resulting decrease in vigilance
toward each individual target increases the overall risk of
damage.

We model this effect by defining a direction of sensory focus,
$\gamma$, for the pursuer. The direction of focus is independent
of the motion of $P$ and is measured counter-clockwise from
$PE_1$. By steering the direction of focus toward an evader,
the pursuer reduces the cost generated by that evader. The
resulting reduced costs are a product of the cost reduction
function and the original evader cost:

$$C_{E_1}(\gamma, x) = S(\gamma)\,C_1(d_1) \tag{15a}$$

$$C_{E_2}(\gamma, x) = S(\gamma - \alpha)\,C_2(d_2) \tag{15b}$$

where $S(\cdot)$ represents the cost reduction as a function of
the difference between the sensory focus angle and the
angle towards the evader. In this paper we use the following
definition for $S(\cdot)$.

$$S(\cdot) = \tfrac{1}{2}\left[1 - \cos(\cdot)\right] \tag{16}$$


The total evader-generated cost for the pursuer is the sum
of the individual evader costs:

$$C_E(\gamma, x) = C_{E_1}(\gamma, x) + C_{E_2}(\gamma, x) \tag{17}$$

The pursuer must then select $\gamma^*$ such that the total cost
is minimized at any moment in time. The minimizing $\gamma^*$
satisfies the following conditions

$$\cos\gamma^* = \frac{C_1 + C_2\cos\alpha}{\rho}, \qquad \sin\gamma^* = \frac{C_2\sin\alpha}{\rho} \tag{18}$$

where

$$\rho = \sqrt{C_1^2 + 2C_1C_2\cos\alpha + C_2^2} \tag{19}$$

Substituting the optimal $\gamma$-strategy, (18)-(19), into (17)
provides the minimum cost:

$$C_E(\gamma^*, x) = \tfrac{1}{2}\left[C_1 + C_2 - \sqrt{C_1^2 + 2C_1C_2\cos\alpha + C_2^2}\right] \tag{20}$$

It should be noted that this function evaluates to zero
when $\alpha = 0$. This situation allows the pursuer to monitor
both evaders simultaneously. The evader cost function is
maximized when $\alpha = \pi$, which represents the scenario in
which the evaders have flanked the pursuer and it can only
direct its beam of focus at the most costly evader.
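
The closed-form minimum over the focus direction can be checked numerically. The sketch below assumes the cost reduction function is one half of one minus the cosine of its argument and that both evaders share the same cost constants; function names, default parameters, and the grid size are illustrative choices, not values from the paper.

```python
import math


def evader_cost(distance, capture_distance, k1=1.0, k2=1.0):
    """Exponential individual cost: maximal at capture range, decaying with distance."""
    return k1 * math.exp(k2 * (capture_distance - distance))


def focus_reduction(angle):
    """Assumed reduction factor: 0 when focused on the evader, 1 when facing away."""
    return 0.5 * (1.0 - math.cos(angle))


def total_cost(gamma, c1, c2, alpha):
    """Evader-generated cost for a given focus direction gamma."""
    return focus_reduction(gamma) * c1 + focus_reduction(gamma - alpha) * c2


def optimal_focus(c1, c2, alpha):
    """Closed-form minimizing focus direction and the resulting minimum cost."""
    rho = math.sqrt(max(0.0, c1 * c1 + 2.0 * c1 * c2 * math.cos(alpha) + c2 * c2))
    gamma_star = math.atan2(c2 * math.sin(alpha), c1 + c2 * math.cos(alpha))
    return gamma_star, 0.5 * (c1 + c2 - rho)


def grid_minimum(c1, c2, alpha, samples=20000):
    """Brute-force minimum over a uniform grid of focus directions."""
    return min(total_cost(2.0 * math.pi * i / samples, c1, c2, alpha)
               for i in range(samples))


c1, c2 = evader_cost(2.0, 1.0), evader_cost(3.0, 1.0)
for alpha in (0.0, 1.0, 2.0, math.pi):
    _, closed = optimal_focus(c1, c2, alpha)
    print(f"alpha={alpha:.3f} closed-form={closed:.6f} grid={grid_minimum(c1, c2, alpha):.6f}")
```

Under these assumptions the minimum cost vanishes when the evaders are collinear on the same side of the pursuer and equals the smaller of the two individual costs when they are on opposite sides, consistent with the flanking interpretation above.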

Because $\gamma$ does not affect the system dynamics and the
pursuer can instantaneously choose any value for $\gamma$, we
will assume the pursuer always chooses $\gamma^*$. As a result,
we will consider the instantaneous evader-generated cost as
a function of state alone and no longer consider $\gamma$ in the
development of the game. An additional constant cost term,
$c_t$, is added to the evader-generated cost in order to represent
a time or energy penalty for the pursuer. The total pursuer
instantaneous cost is then

$$C_T(x) = C_E(x) + c_t \tag{21}$$

C.  Game  Formulation 

The instantaneous cost function (21) is integrated over
time to calculate the total cost to the pursuer over a single
play of the game. In this game, termination occurs when the
pursuer captures one of the evaders, which happens when
the state passes through the terminal surface:

$$\Gamma(x) = (d_1 - d_c)(d_2 - d_c) = 0 \tag{22}$$

The cost to the pursuer for a game starting at initial time $t_0$
and reaching the terminal surface at time $t_f$ is then defined
as:

$$\int_{t_0}^{t_f} C_T(x(t))\,dt \tag{23}$$


We can now pose a differential game in which the goal
of the two evaders is to maximize the integral cost to the
pursuer, (23). By inspection, it can be seen that in general
the evaders should strive to delay termination of the game in
order to continue the integration of cost. Simultaneously,
the pursuer strives to minimize its cost by terminating the
game as soon as possible while attempting to avoid potential
flanking maneuvers of the evaders.
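
To make the integral cost concrete, the sketch below accumulates the instantaneous cost along one trajectory until the terminal surface is reached. The strategies used here are deliberately naive and hypothetical (each evader flees straight away from the pursuer, and the pursuer heads straight at the first evader); they are not the equilibrium strategies derived in the paper. The parameter values, the factor of one half in the evader-generated cost, and the rectangle-rule integration are likewise assumptions of this example.

```python
import math


def evader_generated_cost(d1, d2, alpha, d_c, k1=1.0, k2=1.0):
    """Minimum evader-generated cost for the current relative configuration."""
    c1 = k1 * math.exp(k2 * (d_c - d1))
    c2 = k1 * math.exp(k2 * (d_c - d2))
    rho = math.sqrt(max(0.0, c1 * c1 + 2.0 * c1 * c2 * math.cos(alpha) + c2 * c2))
    return 0.5 * (c1 + c2 - rho)


def play_game(d1, d2, alpha, v1=1.0, v2=1.0, vp=2.0, d_c=1.0, c_t=0.1,
              dt=1e-3, max_time=100.0):
    """Integrate the pursuer's cost until capture under fixed, naive strategies."""
    theta1 = theta2 = 0.0  # hypothetical: evaders flee radially from the pursuer
    psi = 0.0              # hypothetical: pursuer heads directly at evader 1
    t, cost = 0.0, 0.0
    while d1 > d_c and d2 > d_c and t < max_time:
        cost += (evader_generated_cost(d1, d2, alpha, d_c) + c_t) * dt
        d1_dot = v1 * math.cos(theta1) - vp * math.cos(psi)
        d2_dot = v2 * math.cos(theta2) - vp * math.cos(psi - alpha)
        alpha_dot = ((v2 * math.sin(theta2) - vp * math.sin(psi - alpha)) / d2
                     - (v1 * math.sin(theta1) - vp * math.sin(psi)) / d1)
        d1 += d1_dot * dt
        d2 += d2_dot * dt
        alpha += alpha_dot * dt
        t += dt
    return t, cost


t_f, total = play_game(d1=3.0, d2=4.0, alpha=1.0)
print(f"capture time ~ {t_f:.3f}, accumulated cost ~ {total:.4f}")
```

Even with these naive strategies the two contributions are visible: the time penalty grows linearly until capture, and the evader-generated term adds cost whenever the evaders are angularly separated as seen from the pursuer.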

Although there are three agents in this system, the two
evaders share a common goal: to maximize the pursuer's cost.
Therefore, the evaders can be thought of as a single player
with two control variables. This perspective results in a two-player
zero-sum game; one player is the pursuer, while the
other player represents the evading team. We can then define
a function $V(x)$, which represents the value of a game that
starts at point $x$ and in which both players implement their
optimal strategies.

In this paper, we assume that all agents possess complete
knowledge of all state variables. The pursuer does not possess
knowledge of either evader's control, and the evaders
likewise have no knowledge of the pursuer's control.

III.  Solution  to  the  Game 

In this section we will develop the solution to the game.
For this paper, we will examine the case where $v_1 = v_2 = 1$,
$k_1 = k_2 = 1$, and $c_t > 0$. It is assumed that $t_0 = 0$. We
will first calculate the optimality conditions that describe the
optimal control strategies. Using the calculated optimality
conditions, we numerically integrate backwards in time to
generate the optimal trajectories. We will then discuss some
of the interesting singular surfaces generated within the state
space and their effects on the optimal state trajectories. All
of the following calculations are performed using the relative
coordinate system.

A.  Optimality  Conditions  for  the  Game  of  Attack 

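In order to find the optimal control strategies and the
resulting state trajectories, we begin by calculating the optimality
conditions of differential games first described by
Rufus Isaacs [7]. Using the defined kinematic equations, (7)-(11),
and the cost functional (21), the Hamiltonian, $H$, is
introduced as

$$
\begin{aligned}
H &= \lambda^T f(x, u_p, u_e) + C_T \\
  &= \lambda_1\dot{d}_1 + \lambda_2\dot{d}_2 + \lambda_\alpha\dot{\alpha} + \lambda_\beta\dot{\beta} + \lambda_x\dot{x} + \lambda_y\dot{y} + C_T
\end{aligned}
\tag{24}
$$

The vector $\lambda = (\lambda_1\ \lambda_2\ \lambda_\alpha\ \lambda_\beta\ \lambda_x\ \lambda_y)^T$ contains the adjoint
variables conjugate to the kinematic equations. The
adjoint equations are found by taking the partial derivative
of the Hamiltonian with respect to their respective state
component:

$$\dot{\lambda}_1 = -\frac{\partial H}{\partial d_1} = -\lambda_\alpha\frac{\partial\dot{\alpha}}{\partial d_1} - \lambda_\beta\frac{\partial\dot{\beta}}{\partial d_1} - \frac{\partial C_T}{\partial d_1} \tag{25}$$

$$\dot{\lambda}_2 = -\frac{\partial H}{\partial d_2} = -\lambda_\alpha\frac{\partial\dot{\alpha}}{\partial d_2} - \frac{\partial C_T}{\partial d_2} \tag{26}$$

$$\dot{\lambda}_\alpha = -\frac{\partial H}{\partial \alpha} = -\lambda_2\frac{\partial\dot{d}_2}{\partial \alpha} - \lambda_\alpha\frac{\partial\dot{\alpha}}{\partial \alpha} - \frac{\partial C_T}{\partial \alpha} \tag{27}$$

$$\dot{\lambda}_\beta = -\frac{\partial H}{\partial \beta} = -\lambda_x\frac{\partial\dot{x}}{\partial \beta} - \lambda_y\frac{\partial\dot{y}}{\partial \beta} \tag{28}$$

$$\dot{\lambda}_x = -\frac{\partial H}{\partial x} = 0 \tag{29}$$

$$\dot{\lambda}_y = -\frac{\partial H}{\partial y} = 0 \tag{30}$$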
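
For reference, the partial derivatives appearing in the adjoint equations can be written out explicitly. The following worked sketch assumes that the relative kinematics take the form $\dot{d}_2 = v_2\cos\theta_2 - v_p\cos(\psi-\alpha)$, $\dot{\beta} = (v_1\sin\theta_1 - v_p\sin\psi)/d_1$, and $\dot{\alpha} = (v_2\sin\theta_2 - v_p\sin(\psi-\alpha))/d_2 - \dot{\beta}$, and that the instantaneous cost is $C_T = \tfrac{1}{2}(C_1 + C_2 - \rho) + c_t$ with $C_i = k_1 e^{k_2(d_c - d_i)}$ and $\rho = \sqrt{C_1^2 + 2C_1C_2\cos\alpha + C_2^2}$. These forms are readings of the definitions given earlier, not additional results.

```latex
\begin{aligned}
\frac{\partial \dot{\alpha}}{\partial d_1} &= \frac{v_1\sin\theta_1 - v_p\sin\psi}{d_1^2}, &
\frac{\partial \dot{\alpha}}{\partial d_2} &= -\frac{v_2\sin\theta_2 - v_p\sin(\psi-\alpha)}{d_2^2}, \\
\frac{\partial \dot{\alpha}}{\partial \alpha} &= \frac{v_p\cos(\psi-\alpha)}{d_2}, &
\frac{\partial \dot{d}_2}{\partial \alpha} &= -v_p\sin(\psi-\alpha), \\
\frac{\partial \dot{\beta}}{\partial d_1} &= -\frac{v_1\sin\theta_1 - v_p\sin\psi}{d_1^2}, &
\frac{\partial C_T}{\partial \alpha} &= \frac{C_1 C_2 \sin\alpha}{2\rho}, \\
\frac{\partial C_T}{\partial d_1} &= -\frac{k_2 C_1}{2}\left(1 - \frac{C_1 + C_2\cos\alpha}{\rho}\right), &
\frac{\partial C_T}{\partial d_2} &= -\frac{k_2 C_2}{2}\left(1 - \frac{C_2 + C_1\cos\alpha}{\rho}\right).
\end{aligned}
```

Note that the derivative of the cost with respect to the separation angle is non-negative for angles between zero and pi, which is the term that rewards the evaders for spreading apart.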


The boundary conditions, $\Psi$, for the game are

$$\Psi = \begin{pmatrix} d_1(t_0) - d_{10} \\ d_2(t_0) - d_{20} \\ \alpha(t_0) - \alpha_0 \\ \beta(t_0) - \beta_0 \\ x(t_0) - x_0 \\ y(t_0) - y_0 \\ (d_1(t_f) - d_c)(d_2(t_f) - d_c) \end{pmatrix} = 0 \tag{31}$$

where $d_{10}$, $d_{20}$, $\alpha_0$, $\beta_0$, $x_0$, and $y_0$ are the initial values of
their respective state components at the start of the game. In
order to determine the boundary constraints on the adjoint
variables, we use the boundary conditions, (31), to create a
function of terminal conditions, $\Phi$:

$$\Phi = \nu^T \Psi \tag{32}$$

where $\nu = (\nu_1\ \nu_2\ \nu_3\ \nu_4\ \nu_5\ \nu_6\ \nu_7)^T$ contains the adjoint
variables conjugate to the boundary constraints of the state.
Taking the partial derivatives of (32) with respect to the state
components provides the terminal conditions for the adjoint
variables:


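$$\lambda_1(t_f) = \frac{\partial \Phi}{\partial d_1(t_f)} = \nu_7 (d_2 - d_c) \tag{33}$$

$$\lambda_2(t_f) = \frac{\partial \Phi}{\partial d_2(t_f)} = \nu_7 (d_1 - d_c) \tag{34}$$

$$\lambda_\alpha(t_f) = \frac{\partial \Phi}{\partial \alpha(t_f)} = 0 \tag{35}$$

$$\lambda_\beta(t_f) = \frac{\partial \Phi}{\partial \beta(t_f)} = 0 \tag{36}$$

$$\lambda_x(t_f) = \frac{\partial \Phi}{\partial x(t_f)} = 0 \tag{37}$$

$$\lambda_y(t_f) = \frac{\partial \Phi}{\partial y(t_f)} = 0 \tag{38}$$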

Using the adjoint derivatives, (28)-(30), and the terminal
constraints, (36)-(38), it is found that

$$\lambda_\beta(t) = 0, \quad \lambda_x(t) = 0, \quad \lambda_y(t) = 0 \tag{39}$$
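
The reasoning behind (39) can be spelled out in two steps. The sketch below uses only (28)-(30) and (36)-(38), reading the adjoint symbols in the extracted text as $\lambda_\beta$, $\lambda_x$ and $\lambda_y$; the partial derivatives of $\dot{x}$ and $\dot{y}$ are left symbolic because their explicit form depends on the kinematic equations.

$$
\begin{aligned}
\dot{\lambda}_x = 0,\quad \lambda_x(t_f) = 0 \;&\Rightarrow\; \lambda_x(t) = 0 \ \text{for all } t, \\
\dot{\lambda}_y = 0,\quad \lambda_y(t_f) = 0 \;&\Rightarrow\; \lambda_y(t) = 0 \ \text{for all } t, \\
\dot{\lambda}_\beta = -\lambda_x \frac{\partial \dot{x}}{\partial \beta} - \lambda_y \frac{\partial \dot{y}}{\partial \beta} = 0,\quad \lambda_\beta(t_f) = 0 \;&\Rightarrow\; \lambda_\beta(t) = 0 \ \text{for all } t.
\end{aligned}
$$

The third line uses the first two: once $\lambda_x$ and $\lambda_y$ vanish identically, the right-hand side of (28) is zero, so $\lambda_\beta$ keeps its terminal value of zero.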

Substituting (39) into (24) results in a simplified Hamiltonian,
which depends only on the components of the state
that describe the relative configuration of the three agents:

$$H = \lambda_1 \dot{d}_1 + \lambda_2 \dot{d}_2 + \lambda_\alpha \dot{\alpha} + C_T \tag{40}$$


The next step in solving the game is to determine the
optimal strategies for the three agents, which we will denote
as $\theta_1^*$, $\theta_2^*$, and $\psi^*$. For regions in which the gradient of
the value function is continuous, the optimal strategies must
satisfy two conditions, which are often referred to as Isaacs
Conditions. The regions in which the value function or its
gradient is discontinuous are called singular surfaces and will
be discussed in a later section.
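
For reference, the two conditions are usually written as follows for a game in which the pursuer minimizes and the evaders maximize an integral cost with free terminal time. This is the textbook form, stated here as an assumption about what the text refers to; the paper's own statement and sign convention may differ.

$$
\begin{aligned}
\min_{\psi}\ \max_{\theta_1, \theta_2}\ H(x, \lambda, \theta_1, \theta_2, \psi) &= H(x, \lambda, \theta_1^*, \theta_2^*, \psi^*), \\
H(x, \lambda, \theta_1^*, \theta_2^*, \psi^*) &= 0.
\end{aligned}
$$

The first condition selects the saddle-point controls at each state; the second holds along the optimal trajectory and is what later allows a terminal costate to be solved for directly from the Hamiltonian.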

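Theorem 1: Suppose that the value function and the value
function gradient are continuous. The control strategies for
the three agents are then given by

Optimal Control Strategy of E1:

$$\cos\theta_1^* = \frac{\lambda_1}{\rho_1}, \quad \sin\theta_1^* = \frac{\lambda_\alpha}{d_1 \rho_1}, \quad \rho_1 = \sqrt{\lambda_1^2 + \left(\frac{\lambda_\alpha}{d_1}\right)^2} \tag{41}$$

Optimal Control Strategy of E2:

$$\cos\theta_2^* = \frac{\lambda_2}{\rho_2}, \quad \sin\theta_2^* = \frac{\lambda_\alpha}{d_2 \rho_2}, \quad \rho_2 = \sqrt{\lambda_2^2 + \left(\frac{\lambda_\alpha}{d_2}\right)^2} \tag{42}$$

Optimal Control Strategy of P:

$$\cos\psi^* = \frac{c_1}{\rho_p}, \quad \sin\psi^* = \frac{c_2}{\rho_p}, \quad \rho_p = \sqrt{c_1^2 + c_2^2} \tag{43}$$

where

$$c_1 = \frac{\lambda_\alpha}{d_2} \sin\alpha - \lambda_1 - \lambda_2 \cos\alpha \tag{44}$$

$$c_2 = -\lambda_2 \sin\alpha - \frac{\lambda_\alpha}{d_2} \cos\alpha \tag{45}$$

The proof of this theorem is omitted to satisfy space constraints,
but it can be found in the extended version of this
paper.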
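
As a small illustration of how the evader strategies in Theorem 1 would be evaluated numerically, the sketch below computes a unit heading from the costates in the form read from (41)-(42). The function name and arguments are hypothetical, and the signs follow the extracted text, which may have lost a minus sign relative to the original.

```python
import math


def evader_heading(lam_d, lam_alpha, d):
    """Return (cos_theta, sin_theta) for one evader, in the form of (41)-(42).

    lam_d     : costate conjugate to the evader's range (lambda_1 or lambda_2)
    lam_alpha : costate conjugate to the separation angle alpha
    d         : current pursuer-evader range, assumed positive
    """
    radial = lam_d
    tangential = lam_alpha / d
    rho = math.hypot(radial, tangential)  # normalizer rho_i in (41)-(42)
    if rho == 0.0:
        # Both costates vanish, so the heading is not determined.
        raise ValueError("heading is undefined when both costates are zero")
    return radial / rho, tangential / rho


# Example with made-up costate values (not taken from the paper).
cos_t, sin_t = evader_heading(lam_d=3.0, lam_alpha=8.0, d=2.0)
```

The explicit zero check matters in this game: at the instant E1 is captured, both costates entering the E2 heading are zero, which is why the terminal control of E2 needs separate treatment.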

B.  Numerical  Analysis 

Finding an analytic solution to the optimal trajectories is
not practical due to the nonlinear and coupled nature of the
state and adjoint equations. In order to numerically generate
the optimal trajectories, we first substitute the optimal control
strategies (41)-(43) into the kinematic equations (7)-(11)
and the adjoint equations (25)-(30). The resulting system of
twelve ordinary differential equations describes the optimal
trajectories of the three agents and the corresponding costates
for this game. We can then numerically integrate backwards
in time from the terminal surface to generate the optimal
trajectories.
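
The backward integration step can be sketched with a fixed-step classical Runge-Kutta scheme, as below. This is a minimal sketch and not the authors' solver: the integrator name is hypothetical, and `toy_rhs` is a two-variable stand-in for the twelve coupled state and costate equations, which are not reproduced here.

```python
import math


def rk4_backward(rhs, z_final, t_final, t_start, steps):
    """Integrate dz/dt = rhs(t, z) from t_final back to t_start with classical RK4.

    Returns a list of (t, z) pairs, starting at the terminal point.
    """
    h = (t_start - t_final) / steps  # negative step: marching backwards in time
    t = t_final
    z = list(z_final)
    path = [(t, list(z))]
    for _ in range(steps):
        k1 = rhs(t, z)
        k2 = rhs(t + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k1)])
        k3 = rhs(t + h / 2, [zi + h / 2 * ki for zi, ki in zip(z, k2)])
        k4 = rhs(t + h, [zi + h * ki for zi, ki in zip(z, k3)])
        z = [zi + h / 6 * (a + 2 * b + 2 * c + d)
             for zi, a, b, c, d in zip(z, k1, k2, k3, k4)]
        t += h
        path.append((t, list(z)))
    return path


def toy_rhs(t, z):
    """Stand-in dynamics (NOT the paper's equations): one state, one costate."""
    x, lam = z
    return [-x, lam]


# Start from values on a "terminal surface" at t = 1 and march back to t = 0.
path = rk4_backward(toy_rhs, [1.0, 1.0], t_final=1.0, t_start=0.0, steps=100)
t0, (x0, lam0) = path[-1]
```

For the toy system the exact values at the initial time are known in closed form (`x0` equals e and `lam0` equals 1/e), which gives a quick check that the backward march is wired correctly before substituting the real equations.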

To find the initial conditions for integration we consider a
point on the terminal surface:

$$x_f = (d_{1f},\ d_{2f},\ \alpha_f,\ \beta_f,\ x_f,\ y_f)^T \tag{46}$$

where $d_{1f} = d_c$ and $d_{2f} > d_c$. From (34)-(38), we find the
terminal adjoint vector:

$$\lambda_f = (\nu_7 (d_2 - d_c),\ 0,\ 0,\ 0,\ 0,\ 0)^T \tag{47}$$

After substituting the optimal control strategies into the
Hamiltonian and evaluating at the terminal state, we may
solve directly for $\lambda_{1f}$:

$$|\lambda_{1f}| = \quad \tag{48}$$

Knowing that E1 attempts to delay capture by increasing $d_1$,
we use the positive value for $\lambda_{1f}$. It should be noted that on
the portion of the terminal surface that represents the capture
of E1, the terminal control for E2 is undefined at the moment
of capture. Conceptually this makes sense because E2 can do
nothing to further delay capture of E1, and any change it can
produce in Cp will have no effect on the integral cost because
the game has ended. However, the control for E2 must be known
in order to start the numerical integration. For this purpose
we can use the control just before capture, which can be found
by taking the limit:

$$\lim_{t \to t_f} \tan\theta_2^* = \lim_{t \to t_f} \frac{\lambda_\alpha}{d_2 \lambda_2} = \frac{\dot{\lambda}_\alpha(t_f)}{d_{2f}\, \dot{\lambda}_2(t_f)} \tag{49}$$

Fig. 3. Optimal Trajectories for $d_{2f} = 7$, $\alpha_f = .8$, and $v_p = 2.5$

Fig. 5. Optimal Trajectories for $d_{2f} = 7$, $\alpha_f = 2.8$, and $v_p = 1.1$

We can now use the given terminal state $x_f$ (46), the terminal
values found for $\lambda_f$ (47), and the terminal control for E2 (49)
as initial conditions for our backwards-in-time numerical
integration. The state equations are then integrated over
the time period of interest or until the trajectory reaches a
dispersal surface.
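
The limit in (49) is a 0/0 form, because (34) with $d_{1f} = d_c$ and (35) make both $\lambda_2$ and $\lambda_\alpha$ vanish at $t_f$. A short worked step, assuming $\tan\theta_2^* = \lambda_\alpha / (d_2 \lambda_2)$ as read from (42) and applying L'Hopital's rule, is:

$$
\lim_{t \to t_f} \frac{\lambda_\alpha}{d_2 \lambda_2}
= \lim_{t \to t_f} \frac{\dot{\lambda}_\alpha}{\dot{d}_2 \lambda_2 + d_2 \dot{\lambda}_2}
= \frac{\dot{\lambda}_\alpha(t_f)}{d_{2f}\, \dot{\lambda}_2(t_f)}
$$

The term $\dot{d}_2 \lambda_2$ drops out because $\lambda_2(t_f) = 0$. The result is meaningful only if $\dot{\lambda}_2(t_f) \neq 0$; both derivatives on the right are obtained by evaluating the adjoint equations at the terminal state.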


C.  Illustrative  Cases 

After the integration is performed, the resulting trajectories
in the reduced coordinate system can be mapped to
trajectories in the global coordinate system using (2)-(4).
Several illustrative cases are displayed in Fig. 2 through
Fig. 5. In each of these figures, the solid line represents
the trajectory of the pursuer; the dashed line represents the
trajectory of E1; and the dotted line represents the trajectory
of E2. In all three cases, the game is terminated when E1
is captured. The terminal positions of the three agents are
marked by an x. The markers along the curves represent
the agent locations in two-second increments. In Fig. 2, E2
rushes to meet the pursuer near the point of capture in order
to perform a last-ditch flanking maneuver and create a large
accumulation of cost just before capture. This results in a
counter-flanking maneuver by the pursuer just before capture.
An enlarged view of the trajectories just before capture can
be seen in Fig. 4.

In Fig. 3, the pursuer utilizes its speed advantage and performs
a counter-flanking maneuver against the two evaders in
order to minimize the evader-generated cost. In this scenario,
E1 can increase the cost to the pursuer more by attempting
to remain close and flanking than by fleeing to maximize
the duration of the game. This is similar to
a fight-or-flight decision in nature. E1 knows that it presents
more of a cost to the pursuer by making a stand, and the
evaders hope that this cost may be more than the pursuer is
willing to accept, so that it aborts the attack. Although
the initial conditions of the scenario shown in Fig. 5 are
similar to those of Fig. 3, the pursuer does not possess the same
speed advantage. Therefore, it does not try to outflank the evaders
and instead takes a more direct approach towards E1. Also,
E1 can accumulate more cost by running away and dragging
the game out for a longer period of time. Again, this is a
fight-or-flight situation, but here it is more advantageous for E1
to flee. Throughout the game, E2 continues to harass the
pursuer from behind and accumulate cost.

D.  Singular  Surfaces 

The value function generated by the optimal control strategies
divides the state space into mutually disjoint regions.
Within these regions, the value function is well defined by
the optimality conditions. The manifolds that divide these
regions are called singular surfaces and exhibit
at least one of the following three properties: the optimal
control strategies are not uniquely determined by the optimality
conditions previously described, the value function is not
continuously differentiable, or the value function is
discontinuous [16]. Most singular surfaces are not identified by
backward integration of the optimal trajectories and require
further analysis in order to describe the system behavior on
or near these surfaces.

Within this game, symmetry in the kinematic equations
and cost function hints at the existence of particular singular
surfaces. We begin our analysis of the singular surfaces
by looking at the $\alpha = 0$ plane. On this plane, the three agents
are in a collinear configuration with both evaders on one side
of the pursuer. The pursuer can then direct its beam of focus
at both evaders simultaneously, thereby completely negating
the evader-generated cost. As a result, the pursuer would
like to keep the state near the $\alpha = 0$ plane while the evaders
attempt to force the state away from this plane. For the case
where $c_t > 0$, the $\alpha = 0$ plane represents a dispersal surface.
A dispersal surface is a surface within state space on which
one or both of the players can select from multiple optimal
control strategies. Each of these strategies moves the state off
the surface in a different direction, but all result in the
same value for the game. In this paper, for any game that
begins with the initial state on the $\alpha = 0$ plane, the evaders
make an initial choice to force the $\alpha$-component of the state
away from zero in either the positive or negative direction.
Either direction results in the same value of the game because
of the symmetry of the state equations, cost function, and
the resulting adjoint equations. Although the pursuer could
attempt to hold the state near the $\alpha = 0$ plane, the slight
reduction of evader-generated cost would be outweighed by
the increased time penalty. This dispersal surface appears as
a discontinuity of the gradient of the value function in the
$\alpha$-direction.

The $d_1 = d_2$ plane is also a singular surface. The portion of
this plane where $\alpha >$ ? is clearly a dispersal surface where
the pursuer chooses an evader to capture and forces the state
off the plane in that direction. Under certain conditions,
the region of the $d_1 = d_2$ plane near the intersection with the
$\alpha = 0$ plane has the potential for a singular focal surface. In
this paper, we only consider initial starting positions above
$\alpha =$ % on the $d_1 = d_2$ plane.

IV.  Conclusions 

This paper has developed a novel single-pursuer, two-evader
pursuit-evasion game with an integral cost functional.
In this game, the pursuer strives to minimize the total integral
cost over the course of the game. The two evaders are
represented as a single player with two control variables
and attempt to maximize the pursuer's cost. The generated
optimal trajectories show that the proposed cost function
generates cooperative defensive behaviors between the two
evaders. These behaviors are similar to defensive grouping
and predator mobbing found in nature and could be used in
groups of unmanned systems in order to make attack a less
appealing option for an opposing force.

Future work consists of assigning a cost threshold above which
the pursuer would elect not to attack. We could then define a
region in state space for which pursuit would be too costly.
We would also like to generalize the system to work for an
arbitrary number of evaders. Although the evader-generated
cost can easily be extended to account for more agents,
the new evaders would create many more singular surfaces
that would greatly increase the complexity of the game and
require further analysis.

This paper discusses the collection, post-processing and subsequent evaluation of flight
data of butterflies in various free flight scenarios. A vision tracking system is used to
obtain the flight data, which in turn are used to estimate the motion of
different body parts of the insect, including the abdomen and the wings. These estimates
are subsequently analyzed with a view to establishing the manner in which the insect adapts
the motion of its abdomen to work in tandem with the motion of its wings. Furthermore,
the manner in which this adaptation changes through different flight phases is studied.


I.  INTRODUCTION 

The aerospace engineering community is increasingly interested in the flight mechanics and dynamics of
small flapping air vehicles, Figure 1(a), and natural organisms in the low Reynolds number regime. The
observation and study of flying animals offers a significant source of bio-inspiration in several aeronautical
disciplines, including highly dynamic adaptive structures. While flight measurements of biological systems are
relatively abundant, meaningful recording of the data and efficient distillation of the results remain a work
in progress. Comprehensive data on insects flying in their natural environments are extremely rare.
This paper presents the experimental techniques used for collecting live flight data from Lepidoptera in their
natural environment and illustrates their significant capability to adapt their intricate
wings-abdomen-thorax system to a variety of flight conditions, including some extremely aggressive non-steady
maneuvers.

Figure 1. (a) A 20 cm wingspan ornithopter with a flexible wing in the REEF small wind tunnel. (b) The rain forest
at the McGuire Center for Lepidoptera and Biodiversity, Gainesville, FL.

The flight measurements were performed at the Butterfly Rainforest at the McGuire Center for Lepidoptera
and Biodiversity, Figure 1(b), which is a 650 square meter screened vivarium at the Florida Museum
of Natural History in Gainesville, FL. This center houses over 460 species of subtropical and tropical plants
and trees to support up to 2,000 free-flying butterflies of 120 different species. Natural fliers demonstrate
a diverse array of flight capabilities, many of which are poorly understood. NASA established a research
project to explore and develop flight technologies inspired by biological systems [1]. Aerodynamic research on
flapping insect wings revealed mechanisms such as leading edge vortices (LEVs) and offered design criteria
for insect-based flying machines [2].

There have been numerous research projects performed by the biology community on the flight and
structural behavior of insects. Significant research has presented measurements of insect flight
data that treat the specimen as a multi-body system including head and thorax. For example, in
Calliphora vicina (Blowfly) it was shown that there is a high level of correlation between head and
thorax movements; these were measured using sensor coils, and angular rates of a few thousand degrees
per second were observed during the insect's saccades. A relevant contribution of abdomen posture to
flight control mechanisms was presented for the male Schistocerca gregaria (desert locust), suggesting
that the sensory cue evoking the yaw response is a change in the direction of the relative wind, monitored
by the cephalic wind receptor hairs [4]. The adaptability of the Lepidoptera to different flight requirements
was observed in a non-symmetric passive wing twisting during upstroke and downstroke in the Insecta
Papilionoidea [5] and during the highly unsteady take-off phase in Pieris melete [6]. Using the evolution of
neotropical butterflies as a natural experiment, a correlation between body center of gravity position and
flight maneuverability was demonstrated, focusing on the relative proportions of the thorax and abdomen
as well as the palatability characteristics of different species of butterflies [7]. Flight data gathered during
previous work on Idea leuconoe (Tree Nymph) showed apparently significant abdomen activity in certain
flying phases, with a significant correlation with the flapping wing and body dynamics.

There has also been prior work on locusts tethered in a wind tunnel with the objective of studying
their longitudinal flight dynamics [10-12]. Force measurements obtained in the wind tunnel are
used to determine stability derivatives of the insect under different relative wind velocities and angles of
attack. The literature also includes discussion of CFD-based modeling for
studying insect aerodynamics and flight dynamics [13]. This paper, on the other hand, relies on free flight data.
An advantage of this is the ability to study different maneuvering flight phases of the insect; this advantage,
however, is usually tempered by the fact that the accuracy of wind tunnel data is often better than that of
free flight data. This work is aimed at determining the mechanisms used by live butterflies to adapt their
intricate abdomen and elastic wings system to non-steady flight conditions. These approaches could lead to
the development of new flight mechanics strategies for micro and nano air vehicles.

II.  The  Experimental  Set  Up  and  Post  Processing 

The design of the data acquisition system (DAQ) was based on two key requirements: being non-obtrusive
and being capable of field measurements, i.e. allowing measurements in the insects' natural environment.
A vision-based estimation method is used to study the insect flight with insignificant interference
with the natural behavior of the insects. The visual system is composed of two high-speed digital cameras
synchronized as a stereo pair, as schematically illustrated in Figure 2(a). A stereo pair of cameras with
known parameters and relative pose allows estimation of the 3D position of points in space. The measurements
were performed under natural sunlight conditions at 100 to 200 frames per second and a resolution of 800x600
pixels. Figure 2(b) shows the cameras and computer hardware at their experimental location.
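
To make the idea of stereo triangulation concrete, the sketch below recovers a 3D point from two viewing lines by taking the midpoint of their closest approach. It is an illustrative stand-in and not the method of the tracking software: the camera centres and viewing directions are hypothetical inputs that would in practice come from the calibrated camera parameters and the pixel coordinates of the tracked point.

```python
def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of closest approach between the lines c1 + s*d1 and c2 + t*d2.

    c1, c2 : camera centres (3-element sequences)
    d1, d2 : viewing directions towards the tracked point (need not be unit length)
    """
    def dot(u, v):
        return sum(ui * vi for ui, vi in zip(u, v))

    w = [p - q for p, q in zip(c1, c2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("viewing lines are parallel; depth cannot be recovered")
    # Parameters that minimize the squared distance between the two lines.
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = [ci + s * di for ci, di in zip(c1, d1)]
    p2 = [ci + t * di for ci, di in zip(c2, d2)]
    return [(u + v) / 2 for u, v in zip(p1, p2)]


# Two cameras 1 unit apart, both looking at the same (made-up) point.
point = triangulate_midpoint((0.0, 0.0, 0.0), (0.5, 0.2, 2.0),
                             (1.0, 0.0, 0.0), (-0.5, 0.2, 2.0))
```

With noise-free directions the two lines intersect and the midpoint is the point itself; with measurement noise the lines are skew, and the gap between the two closest points gives a rough indication of the triangulation uncertainty.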

Figure 2. (a) The stereo triangulation technique used by the cameras. (b) The stereo cameras in the experimental
environment.

A sequence of pictures of the desired event is captured from both cameras and converted to two videos,
one for each camera, using a combination of custom and commercial software. The videos are digitized using
stereoscopic tracking software [15] and accurate camera calibration data in order to perform 3D stereovision
estimation of selected points on the target. The tracking software uses an 11-point Direct Linear
Transformation (DLT) method for calibration [14]. The data acquisition and post-processing
methodology was validated, and its uncertainties estimated, using a custom-made target
consisting of multiple spring-mass components mounted on a shaker. The target has three parts simulating
a body, a head and an antenna, and the shaker is controlled by a computer that can induce any desired
oscillatory motion in the body. The target's three-dimensional position in time was measured using
high-resolution dynamic visual image correlation (VIC), normally used in experimental mechanics [9]. Comparisons
with the positions acquired by the tracking software selected for the measurements on butterflies provide
estimates of the experimental uncertainties. A sample frame from a video is shown in Figure 3, depicting
an Idea leuconoe (Tree Nymph) butterfly during an approach for landing on a leaf. In this case a total of
four points were tracked: the tips of the left and right wings, the tip of the abdomen and the abdomen root.


Figure  3.  Tracking  of  body  parts  of  a  butterfly  during  natural  flight.  Note  the  tracked  points  on  the  wing  tips  and  the 
abdomen. 


The raw data may contain substantial voids due to points on the target being occluded, usually by other
body parts or by foliage. Statistical curve-fitting methods are used to fill in the missing data, and smooth time
histories of the 3D position estimates are obtained, as depicted in Figure 4.

Figure 4. Three-dimensional estimates of wing tips of a butterfly during natural descending flight. The trajectories
are depicted after the smoothing process.
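
The text does not name the curve-fitting method used to fill occlusion gaps. As one simple possibility, the sketch below fills missing samples of a single coordinate by linear interpolation and then smooths the result with a centred moving average. Both function names and the choice of method are assumptions made for illustration only.

```python
def fill_gaps(samples):
    """Replace None entries by linear interpolation between the nearest valid samples.

    Gaps at either end are filled with the nearest valid value.
    """
    known = [i for i, v in enumerate(samples) if v is not None]
    if not known:
        raise ValueError("no valid samples to interpolate from")
    filled = []
    for i, v in enumerate(samples):
        if v is not None:
            filled.append(float(v))
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            filled.append(float(samples[right]))
        elif right is None:
            filled.append(float(samples[left]))
        else:
            w = (i - left) / (right - left)
            filled.append((1 - w) * samples[left] + w * samples[right])
    return filled


def moving_average(values, half_width=1):
    """Centred moving average; the window shrinks near the ends of the record."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - half_width)
        hi = min(len(values), i + half_width + 1)
        window = values[lo:hi]
        out.append(sum(window) / len(window))
    return out


# One coordinate of a tracked point, with occluded frames marked as None.
raw = [0.0, None, None, 3.0, None]
smooth = moving_average(fill_gaps(raw), half_width=1)
```

Each of the three coordinates of a tracked point would be processed separately in this way. A spline or polynomial fit would be a natural alternative where gaps span a large fraction of a flapping cycle, since linear interpolation flattens the motion inside the gap.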


III.  Results 

Data obtained from the live measurements are then processed with several objectives in mind. The kinematics
and dynamics of the butterflies' flight are the focus of the flight mechanics segment of the overall project.
The shape-changing and relative elastic deformation of the various body parts, specifically the abdomen, the
wings, the head and the antennae, are the focus of the structural and aeroelastic segment. A combination
of these two segments enables the investigation of possible correlations with the overall flight trajectory and
performance of the insect. Three samples of the numerous flight events recorded in several sessions will be
presented as examples of dynamic in-flight adaptation of the body-wings system.


Figure  5.  Several  flapping  cycles  of  abdomen  tip  and  wings  demonstrated  during  a  fly-by  sequence 


Before we go into a more detailed study of the relative motion of the wings and abdomen in highly
maneuvering flight, we briefly present a figure from a relatively benign flight sequence that is long
enough to show several flapping cycles. This is given in Figure 5, which shows
the vertical axis position of the wing tips and the abdomen tip relative to the abdomen root.

This figure clearly shows that for all the flapping cycles on display, the abdomen tip motion is nearly 180
deg out of phase with the wing tip motion. Also shown in the figure is the vertical axis displacement of the
abdomen root. In the particular flight sequence shown in Figure 5(a), the insect exhibited some periodicity
in its overall flight trajectory, and interestingly the number of cycles of the abdomen root motion is exactly
equal to the number of cycles of motion of the wing tips and of the abdomen tip. Furthermore, Figure 5(a)
shows what appears to be a clear phase lag of the translational motion of the abdomen root (which
represents the overall motion of the insect) relative to the wing and abdomen tip motion; this seems
to indicate that the insect is using its abdomen as an active control device, at least during this particular
flight phase.
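
A phase relationship such as the one described above can be quantified from two sampled time histories. The sketch below is one possible approach, not the analysis used in the paper: it assumes the flapping frequency is known and that the record covers a whole number of flapping cycles, and it estimates each signal's phase from a single-frequency Fourier sum. The function names and the 10 Hz, 200 frames-per-second test signals are made up for illustration.

```python
import math


def phase_at(signal, dt, freq_hz):
    """Phase (rad) of the freq_hz component of a uniformly sampled signal.

    The signal is modelled as A*cos(2*pi*freq_hz*t + phase) plus a constant offset.
    """
    mean = sum(signal) / len(signal)
    omega = 2 * math.pi * freq_hz
    re = sum((v - mean) * math.cos(omega * k * dt) for k, v in enumerate(signal))
    im = sum((v - mean) * math.sin(omega * k * dt) for k, v in enumerate(signal))
    return math.atan2(-im, re)


def phase_lag_deg(a, b, dt, freq_hz):
    """Phase of b relative to a in degrees, wrapped to [-180, 180)."""
    diff = math.degrees(phase_at(b, dt, freq_hz) - phase_at(a, dt, freq_hz))
    return (diff + 180.0) % 360.0 - 180.0


# Synthetic wing-tip and abdomen-tip heights: same frequency, opposite phase.
dt, freq = 1.0 / 200.0, 10.0
wing = [math.sin(2 * math.pi * freq * k * dt) for k in range(200)]
abdomen = [0.3 * math.sin(2 * math.pi * freq * k * dt + math.pi) for k in range(200)]
lag = phase_lag_deg(wing, abdomen, dt, freq)
```

For these synthetic signals the magnitude of `lag` is 180 degrees regardless of the amplitude ratio, which is the anti-phase pattern described for the wing tips and abdomen tip. The same function applied to the abdomen root displacement would give an estimate of its phase lag relative to the wings.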

The flight discussed above does not include any significantly rapid maneuvers. We now turn our
attention to a flight (of an insect from the same species) that does include a sequence of rapid maneuvers.
More specifically, we now look at a flight that consists of an acceleration phase, followed by a deceleration,
followed by a turn and finally a phase of pure descent. During the 180 degree saccade on a horizontal
plane with near zero turning radius, the butterfly's abdomen adapts to the wing motion and significantly
contributes to the dynamics of the turn. This 180 degree turn in yaw was performed in Flight 072708_0101
by an Idea leuconoe (Tree Nymph). A few snapshots of the flight are given in Figure 6.


Figure 6. Sequence during the Tree Nymph saccade. In (1) the butterfly starts a rapid deceleration, followed by
aggressive yaw (2) and roll (3) motions. In (4) it starts a steady descent.


The sequence of phases during this flight is as follows:

a) The insect initially accelerates in the forward direction, while reducing the y-axis component of its velocity
to near zero. This is clearly brought out in Figure 7, which shows the insect velocity (represented by
the velocity of the abdomen root).

b) It then decelerates as it readies itself for a turn. In this phase, the forward velocity of the abdomen root
first reduces until it reaches zero. The abdomen tip, however, continues to
move forward with some velocity. After about 0.1 sec, the forward velocity of the abdomen tip also reduces to
zero.

Figure 7. Overall velocity of the insect

c)  The  head-thorax  turns  upwards,  as  the  abdomen  tip  swings  around,  while  the  abdomen  root  performs  a 
yawing  turn.  This  is  evident  in  Figure  8,  which  shows  the  trajectory  of  the  abdomen  root  and  the  abdomen 
tip  on  a  horizontal  plane.  In  order  to  show  the  relative  position  of  the  abdomen  root  and  tip  during  this 
trajectory,  a  line  (dotted  red)  is  also  shown.  One  can  thus  see  the  motion  of  the  abdomen  root  is  initially 
curved  as  it  takes  a  turn  and  then  (its  projection  on  the  horizontal  plane)  moves  along  a  straight  line.  The 
abdomen  tip  however  continues  to  swing  around  even  after  the  abdomen  root  has  stopped  its  turn.  The 
abdomen  activity  also  seems  to  indicate  that  there  is  an  adaptation  of  the  insect  mass-distribution  for  this 
small  radius  maneuver. 

d)  The  wings  typically  flap  in  phase  in  symmetric  flight  although  in  this  case,  around  this  time,  the  wing 
tips  are  at  180  deg  out  of  phase  with  each  other,  while  the  insect  performs  a  roll.  The  fact  that  they  become 
180  deg  out  of  phase  is  evidenced  in  Figure  9. 

e)  The  insect  loses  altitude  as  the  wing  tips  take  some  finite  time  to  get  back  to  flapping  in  phase  with  each 
other.  The  wing  tips  take  about  0.1  sec  to  transition  from  180  deg  out  of  phase  to  back  in  phase.  During 
that  time  interval,  the  insect  loses  close  to  100  mm  in  altitude.  All  of  these  are  evidenced  in  Figure  9. 

f) As the two wings get back in phase with one another, the abdomen tip simultaneously returns
to 180 deg out of phase with the wings, which represents the normal symmetric flying
condition. This is also evidenced in Figure 9.

At the beginning of the saccade the insect gives priority to the
aerodynamic effects of the wings as well as to positioning the abdomen vertically to maximize the drag. At the
apex of the saccade the butterfly uses the abdomen inertia to execute a snap-roll and later to stabilize
the flapping to go into a descending-hovering mode. The mass and inertia of the relatively heavy abdomen
are dynamically adapted for the various phases of the saccade. A relevant contribution to the gyration inertia
management is also attributed to the wings' moving and closing at strategic times.

Figure 8. Abdomen root and tip trajectory projected on a horizontal plane (X-Z) during a 180 degree turn.
Horizontal axis: x (mm).

Figure 10 shows the left wing tip and abdomen tip trajectories on a vertical plane, relative
to the moving abdomen root. The abdomen root is thus always at the origin of this figure. It is clearly
seen that as the wing tip moves back and forth, as well as up and down, the abdomen tip also shows a
back-and-forth and up-and-down motion. The starting points on each of the two curves are marked
by small green circles. During the initial acceleration phase, the wing tip executes a stroke that is close to
horizontal; this agrees with the general intuition that horizontal wing strokes would be used for thrust
generation. The abdomen tip stroke during this phase has both horizontal and vertical components.
During the deceleration phase, the insect reduces its wing flapping speed and brings its wing tips high above
the abdomen root. There is an accompanying twisting motion of the wings around this time (not visible in
Figure 10), as the insect uses its wings as an airbrake of sorts to slow itself down. Simultaneously, the
insect brings its entire abdomen almost directly under the abdomen root, and the horizontal stroke
of the abdomen is very short during this phase. This is probably because holding the
abdomen as close to vertical as possible ensures that the abdomen contributes
to the drag force. The insect then executes the turn, after which there is a phase of pure descent during which both
the wing and the abdomen execute an almost vertical stroke.

Figure 10 thus shows that the insect uses significant abdomen motion to accompany the wing
motion. In certain flight phases, such as the deceleration, the role of the abdomen seems to
complement that of the wing, as the insect uses both to increase the drag it experiences.
In certain other flight phases, such as the turn, the abdomen seems to play a stronger role than the wings in
generating the flight trajectory. For the data shown in Figure 10, we compute the flapping velocities of the
wing tip and the abdomen tip. The components of these flapping velocities along the three inertial axes are
shown in Figure 11(a). During the initial acceleration phase, the wing flapping velocity has a significant
component along the X-axis, which is the direction of flight of the insect; this component then
reduces during the deceleration phase. The X-axis flapping velocity component of the abdomen
does not seem to be affected by whether the insect is accelerating or decelerating, but it does show the
same periodicity as the wing and consistently remains in a phase opposite to that of the wing. During the
turn, the X- and Y-axis flapping velocity components of the abdomen become almost comparable to those of
the wing, indicating the strong role that the abdomen plays during a turn. Also, along the Y axis, the
flapping velocity of the abdomen initially increases in phase with the wing, but this is then disrupted until
the turn occurs, during which period the abdomen tip velocity becomes nearly opposite in phase
to that of the wing. Figure 11(b) then shows the X-axis components of the flapping velocities of the wing tip
and abdomen tip on a phase plane. The different flight phases are identified by different colors on this plot.

Figure 9. Vertical axis trajectories of different body parts of the insect.

We can then take the scalar sums of the individual velocity components shown in Figure 11(a) to plot the
flapping speeds of the wings and the abdomen. This is shown in Figure 12(a). In this figure, we see that the
wing flapping speed is significantly higher than that of the abdomen in all phases of flight, except during the
turn. During the middle of the turn, the peak amplitude of the abdomen flapping speed is almost exactly equal
to that of the wing. Yet by adjusting the relative phases of the two flapping speeds, the insect ensures that
the ratio of abdomen tip flapping speed to the wing tip flapping speed is in excess of unity during the turn.
Note  that  during  the  initial  acceleration  and  deceleration  phases,  the  abdomen  flapping  speed  varies  almost 
in  phase  with  the  wing  flapping  speed;  and  it  is  just  at  the  commencement  of,  and  during  the  turn  that  this 
relative  phase  pattern  gets  disrupted.  During  the  second  half  of  the  turn,  there  is  an  almost  constant  phase 
lag  of  the  abdomen  tip  flapping  speed  in  relation  to  the  wing  tip  flapping  speed.  Figure  12(b)  then  shows 
the  ratio  of  the  abdomen  tip  flapping  speed  to  the  wingtip  flapping  speed  during  the  different  flight  phases. 
It  is  clearly  seen  that  this  ratio  becomes  significantly  high  during  the  turn  thus  demonstrating  the  major 
role  that  the  abdomen  appears  to  play  during  the  turning  flight  phase,  at  least  as  far  as  this  particular  ratio 
metric  is  concerned. 
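
The speed and ratio computations described above can be sketched as follows. This is a minimal illustration, assuming that the flapping speed is the Euclidean norm of the three inertial velocity components; the function names and the sample velocities are hypothetical and are not data from the flight analyzed here.

```python
import numpy as np

def flapping_speed(v_xyz):
    """Speed of each frame from an (N, 3) array of X, Y, Z velocity components.

    Assumption: the speed is taken as the Euclidean norm of the components.
    """
    return np.linalg.norm(np.asarray(v_xyz, dtype=float), axis=1)

def speed_ratio(v_abdomen, v_wing, eps=1e-12):
    """Frame-by-frame ratio of abdomen tip speed to wing tip speed."""
    wing = flapping_speed(v_wing)
    # Guard against division by zero when the wing tip is momentarily at rest.
    return flapping_speed(v_abdomen) / np.maximum(wing, eps)

# Hypothetical three-frame sample in m/s (illustrative values only).
v_wing = np.array([[3.0, 0.0, 4.0], [1.0, 2.0, 2.0], [0.0, 0.6, 0.8]])
v_abd = np.array([[0.3, 0.0, 0.4], [0.5, 1.0, 1.0], [0.0, 1.2, 1.6]])
ratio = speed_ratio(v_abd, v_wing)
```

A ratio above unity in a frame marks an instant at which the abdomen tip moves faster than the wing tip, which is the condition reported for the turn.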

From  the  data  of  Figure  7  and  assuming  a  nominal  mass  of  0.35  grams,  we  get  the  inertial  forces  acting 
on  the  insect.  These  are  given  in  Figure  13,  from  which  we  see  that  the  magnitudes  of  these  forces  are  of 
the  order  of  0.01  Newtons.  Figure  14(a)  shows  a  comparison  of  the  Reynolds  Number  of  the  wing  tip  and 
the  abdomen  tip  through  the  different  flight  phases.  For  this  computation,  the  characteristic  length  of  the 
abdomen  was  taken  as  the  abdomen  length  itself  while  for  the  wing,  the  characteristic  length  was  taken  as 
the  mean  aerodynamic  chord  of  the  wing.  Typical  values  of  these  quantities  for  the  species  being  considered 
are  0.02912  meters  and  0.03844  meters,  respectively.  The  diameter  of  the  abdomen  is  typically  0.00485 
meters.  The  Reynolds  Number  of  the  wing  tip  ranges  from  2000  to  about  15000,  while  that  of  the  abdomen 
tip  ranges  from  2000  to  about  8000.  The  Reynolds  Number  of  the  abdomen  is  generally  lower  than  that 
of  the  wing,  except  during  the  acceleration  and  the  turn  phases.  During  these  two  phases,  the  Reynolds 
number  of  the  abdomen  tip  is  comparable  to  that  of  the  wing  tip.  Figure  14(b)  shows  the  advance  ratio  of 
the  wing. 
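
As a rough check on the orders of magnitude quoted above, the inertial force and the Reynolds number can be evaluated directly from their definitions. The sketch below assumes a kinematic viscosity of air of 1.5e-5 m^2/s and uses a hypothetical speed and acceleration; only the mass and the two characteristic lengths are taken from the text.

```python
NU_AIR = 1.5e-5           # assumed kinematic viscosity of air, m^2/s
MASS = 0.35e-3            # nominal insect mass from the text, kg
ABDOMEN_LENGTH = 0.02912  # characteristic length of the abdomen, m
WING_CHORD = 0.03844      # mean aerodynamic chord of the wing, m

def reynolds_number(speed, length, nu=NU_AIR):
    """Re = U * L / nu for a speed U (m/s) and characteristic length L (m)."""
    return speed * length / nu

def inertial_force(acceleration, mass=MASS):
    """F = m * a for one acceleration component (m/s^2)."""
    return mass * acceleration

# Hypothetical tip speed of 3 m/s and body acceleration of 30 m/s^2.
re_wing = reynolds_number(3.0, WING_CHORD)
re_abdomen = reynolds_number(3.0, ABDOMEN_LENGTH)
force = inertial_force(30.0)
```

With these assumed inputs the Reynolds numbers fall inside the ranges reported above, and the force is of the order of 0.01 Newtons.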

Figure 10. Trajectories of left wing tip and abdomen tip shown relative to the abdomen root on a vertical plane. The
different flight phases are identified by different colors.

There is significant wing cambering activity as the butterfly prepares itself for the deceleration. This
is demonstrated in Figure 15. To generate the plots in this figure, three representative points on the wing
chord were tracked: one on the leading edge, a second on the trailing edge and a third in between the leading
and trailing edges. In the frames when these three points are found to be nearly collinear, it is construed
that there is negligible wing cambering, while at other times it is construed that there is significant wing
camber. The wing cambering is demonstrated through a succession of frames, with the abdomen root being
positioned at the origin of each of these Figures 15(a-d). In these figures, the color sequence is as follows:
the first frame in each figure is represented in blue, followed by green, red, cyan and magenta. Also plotted,
for the purposes of comparison, is the wing tip flapping velocity vector. From Figure 15(a), it is seen that in
frame 66, the wing has negligible camber and then progressively gets cambered as it executes a downstroke.
Figure 15(b) shows the continued presence of the camber during the wing upstroke motion. Some wing
camber continues to be present further along the upstroke; and after the wing crosses the abdomen root,
this camber then begins to reduce (as seen in Figure 15(c-d)).
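
The collinearity test used to detect camber can be sketched as below. This is an illustrative formulation only: the camber measure (perpendicular offset of the middle point from the chord line, divided by the chord length) and the tolerance value are assumptions, since the text does not state how "nearly collinear" was quantified.

```python
import numpy as np

def camber_ratio(leading, mid, trailing):
    """Offset of the mid-chord point from the chord line, per unit chord length."""
    le, mp, te = (np.asarray(p, dtype=float) for p in (leading, mid, trailing))
    chord = te - le
    chord_len = np.linalg.norm(chord)
    if chord_len == 0.0:
        raise ValueError("leading and trailing edge points coincide")
    # |chord x (mid - le)| / |chord| is the perpendicular distance to the chord line.
    offset = np.linalg.norm(np.cross(chord, mp - le)) / chord_len
    return offset / chord_len

def is_cambered(leading, mid, trailing, tol=0.02):
    """True when the three tracked points are not nearly collinear (tol is hypothetical)."""
    return camber_ratio(leading, mid, trailing) > tol

flat = camber_ratio([0, 0, 0], [0.5, 0.0, 0], [1, 0, 0])
bent = camber_ratio([0, 0, 0], [0.5, 0.1, 0], [1, 0, 0])
```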

The next relevant example of in-flight structural adaptation is a glide with very
mild flapping activity (flight number 0330080205). Interestingly, the event displays in-flight wing
twisting and a change of dihedral (on a single wing), with probably no inertial loads from flapping motion
on the wing. This appears quite remarkable given the absence of any muscles in the butterfly’s wing.
Figure 17 illustrates a three-dimensional plot of the trajectory of two chord-wise sections on each wing (right
wing blue, left wing red). The absence of significant flapping and the presence of twisting activity are
both evident. Again the abdomen (green line) is probably used to dynamically adapt the center-of-gravity
position to the new flight requirements (the orientation changes at low rates).


Figure 11. (a) Wing tip and abdomen tip flapping velocities along the X, Y and Z axes, (b) Phase plane plot of wingtip
and abdomen tip flapping velocities along the X axis.

Figure 12. (a) Comparison of wing tip and abdomen tip flapping velocities during different flight phases, (b) Ratio of
abdomen tip flapping speed to wing tip flapping speed.


[Plot: Forces on the abdomen root (X, Y and Z axes) versus Time (sec)]

Figure 13. Inertial force components on the insect


IV.  Conclusions 

This paper discusses the collection and analysis of free flight data of butterflies in their natural environment.
A particular flight with several rapid changes in flight phase is evaluated, with the objective being to
determine  the  manner  in  which  the  insect  adapts  the  motion  of  its  abdomen  to  that  of  its  wings;  and  also 
to  determine  how  the  manner  of  this  adaptation  changes  from  one  flight  phase  to  the  next.  Instances  of  the 
insect  adapting  its  wing  shape  in  an  aeroelastic  manner  are  also  demonstrated.  Future  work  will  comprise 
the  use  of  sophisticated  mathematical  tools  to  perform  a  deeper  analysis  of  this  adaptive  behavior. 
Figure 15. Demonstration of wing cambering activity as the butterfly prepares for deceleration.

Figure 16. (a) Snapshot frame demonstrating insect flight with wing dihedral, (b) Snapshot frame demonstrating
insect flight with no wing dihedral.


American  Institute  of  Aeronautics  and  Astronautics 


Figure  17.  Sequence  during  a  steady  descending  glide  with  little  flapping  activity.  The  segments  represent  the  right 
wing  chord  at  mid-wing  (blue),  the  left  wing  chord  (red)  and  abdomen  (green). 

Abstract — Aeroelastic wing micro-autonomous aerial systems
(MAAS)  concepts  are  being  explored  for  military  and  civilian 
applications.  However,  on  the  whole,  the  issues  of  control 
of  MAAS  are  largely  unexplored.  Controllers  designed  using 
methods  applicable  to  larger  aircraft  are  unlikely  to  realize  the 
agile  flight  potential  of  flexible  wing  MAAS  airframes.  In  this 
paper,  the  authors  use  two  Euler-Bernoulli  beams  connected 
to  a  rigid  mass  to  model  an  aeroelastic  wing  MAAS.  They 
employ  Continuous  Sensitivity  Equation  Methods  to  examine 
the  sensitivity  of  the  controlled  state  with  respect  to  variation 
of the $H_\infty$ control parameter, with the primary goal being to
gain  insight  into  the  flexible  dynamics  of  the  system  in  order  to 
exploit  the  flexibility  for  control  purposes.  Further,  the  authors 
examine  functional  gains  in  order  to  determine  optimal  sensor 
placement  while  taking  advantage  of  the  flexibility  of  the  MAAS 
model. 

I.  INTRODUCTION 

Considerable  work  is  currently  underway  to  investigate 
the  aerodynamics,  structural  dynamics,  flight  mechanics,  and 
control  associated  with  bio-inspired  flight  (see  for  example 
[1],  [2],  [3],  [4],  [5]).  Consequently,  aeroelastic  wing  micro- 
autonomous  aerial  systems  (MAAS)  concepts  are  being 
explored  for  military  and  civilian  applications.  Work  from 
other  projects  (see  for  example  [6],  [7],  [8],  [9])  is  laying 
the  foundation  required  to  eventually  construct  high  fidelity 
dynamics  models  of  MAAS,  which  do  not  currently  exist, 
though  key  features  of  such  models  are  emerging.  However, 
on  the  whole  the  issues  of  control  of  agile  aeroelastic 
wing  MAAS  are  largely  unexplored.  All  micro-scale  vehicles 
developed  to  date  exhibit  only  limited  autonomy,  generally 
way-point  trajectory  following,  with  limited  agility. 

In  this  paper,  the  authors  use  two  Euler-Bernoulli  beams 
connected  to  a  rigid  mass  in  an  initial  effort  to  model  an 
aeroelastic  wing  MAAS.  Each  beam  represents  a  flexible 
wing,  while  the  rigid  mass  represents  the  fuselage.  This 
“beam-mass-beam”  model  will  be  referred  to  as  the  BMB 
model  system  in  this  paper.  The  authors  employ  Continuous 
Sensitivity  Equation  Methods  to  examine  the  sensitivity  of 
the controlled state with respect to variation of the $H_\infty$ control
parameter, with the goal being to gain insight into the flexible
dynamics of the system in order to exploit the flexibility
for control design purposes. A secondary goal of this aspect
of the research is to explore the possibility of determining
an efficient assignment of the $H_\infty$ control parameter that is
mathematically  justified  and  does  not  require  an  iterative 
procedure  for  determination.  Chen  identified  the  problem  of 
finding  an  optimal  value  for  the  control  design  parameter 
as  an  unsolved  problem  in  systems  and  control  theory  [10]. 
Further,  the  authors  examine  both  controller  and  observer 
functional  gains  in  order  to  obtain  insight  into  the  problem 
of  optimal  actuator  and  sensor  placement  for  MAAS  systems. 
The  approaches  explored  numerically  in  this  paper  seek  to 
take  advantage  of  the  flexibility  of  aeroelastic  wings  and 
exploit  the  same  to  achieve  agility  as  opposed  to  viewing 
this  characteristic  as  a  hindrance  for  control  design. 

The outline of the paper is as follows. The $H_\infty$ controller is
summarized in Section II. Section III provides a description
of the equations governing the partial differential equation
(PDE) model, along with the variational forms of the PDE
equations and state sensitivity. Numerical results are presented
in Section IV. Conclusions and directions for future
work are given in Section V.

II. $H_\infty$ CONTROL DESIGN

In this section, the authors present a short overview of
the $H_\infty$ compensator design in state space form [11], [12].
Assume the existence of a linear PDE system of the form

$$\dot{x}(t) = \mathcal{A}x(t) + \mathcal{B}u(t), \quad x(0) = x_0, \tag{1}$$

where $x(t) = x(t,\cdot) \in X$ is the state of the linear system and $X$
is a Hilbert space. Here, $\mathcal{A}$ is the system operator defined on
$D(\mathcal{A}) \subset X$ that, by assumption, generates an exponentially
stable $C_0$ semigroup, $\mathcal{B}$ is the control operator, and $u(t)$ is
the control input, defined on Hilbert space $U$, which is taken
to be $\mathbb{R}^m$ in this work. It is assumed that knowledge of
only part of the system can be obtained through the state
measurement $y$ on Hilbert space $Y$, which is taken to be $\mathbb{R}^p$
in this work, where $y(t) = \mathcal{C}x(t)$. Assume an estimate of the
state is used in the control law. To provide this estimate, a
compensator is used that has the form

$$\dot{x}_c(t) = \mathcal{A}_c x_c(t) + \mathcal{F}y(t), \quad x_c(0) = x_{c_0}, \tag{2}$$

and the feedback control law is written

$$u(t) = -\mathcal{K}x_c(t), \tag{3}$$

where $x_c(t) = x_c(t,\cdot) \in X$ is the state estimate. Designing a
controller of this type requires determining $\mathcal{A}_c$, $\mathcal{F}$, and $\mathcal{K}$.


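By solving the Riccati equations

$$\mathcal{A}^*\Pi + \Pi\mathcal{A} - \Pi\left(\mathcal{B}R^{-1}\mathcal{B}^* - \theta^2\mathcal{B}\mathcal{B}^*\right)\Pi + \mathcal{C}^*\mathcal{C} = 0, \tag{4}$$

where $R : U \to U$ is a weighting operator for the control of
the form $R = cI$, with $c$ a scalar and $I$ the identity operator,
and

$$\mathcal{A}P + P\mathcal{A}^* - P\left(\mathcal{C}^*\mathcal{C} - \theta^2\mathcal{C}^*\mathcal{C}\right)P + \mathcal{B}\mathcal{B}^* = 0, \tag{5}$$

one can obtain the operators $\mathcal{K}$, $\mathcal{F}$, and $\mathcal{A}_c$ via

$$\begin{aligned}
\mathcal{K} &= R^{-1}\mathcal{B}^*\Pi, \\
\mathcal{F} &= (I - \theta^2 P\Pi)^{-1}P\mathcal{C}^*, \\
\mathcal{A}_c &= \mathcal{A} - \mathcal{B}\mathcal{K} - \mathcal{F}\mathcal{C} + \theta^2\mathcal{B}\mathcal{B}^*\Pi.
\end{aligned} \tag{6}$$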

The resulting feedback control is applied to the original linear
system; the closed loop linear system is then defined by

$$\frac{d}{dt}\begin{bmatrix} x(t) \\ x_c(t) \end{bmatrix} = \begin{bmatrix} \mathcal{A} & -\mathcal{B}\mathcal{K} \\ \mathcal{F}\mathcal{C} & \mathcal{A}_c \end{bmatrix}\begin{bmatrix} x(t) \\ x_c(t) \end{bmatrix}. \tag{7}$$

For sufficiently small $\theta$, there are guaranteed minimal solutions
$\Pi$ and $P$ to (4) and (5), respectively, such that
$(I - \theta^2 P\Pi)$ is positive definite and the linear closed loop
system (7) is stable. Note that $\theta = 0$ yields the classical
Linear Quadratic Gaussian (LQG) compensator design. Since
there exist no prescribed formulas for $\theta$, there is an inherent
computational expense for this control design in choosing
the parameter value. As a secondary goal, the authors seek
to use sensitivity analysis to gain a better understanding
of the $H_\infty$ controller. The goal is to develop a methodology
for choosing $\theta$ to satisfy performance and robustness
criteria, while justifying that choice based on the analysis.
To this end, sensitivity analysis is applied to $H_\infty$ controlled
distributed parameter systems to examine the sensitivity of
the controlled state to $\theta$.
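
For a finite-dimensional approximation, the design steps of this section can be sketched numerically. The sketch below is illustrative and rests on stated assumptions: the disturbance and performance weights in the Riccati equations are taken as $BB^T$ and $C^TC$, the control weight is $R = cI$, and the example matrices describe a hypothetical damped oscillator rather than the beam model of this paper. Under these assumptions the $\theta$ terms can be folded into an effective weight, so SciPy's standard Riccati solver `solve_continuous_are` applies.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

def compensator(A, B, C, c=1.0, theta=0.0):
    """Compensator gains for assumed weights R = c*I, B B^T and C^T C."""
    n, m, p = A.shape[0], B.shape[1], C.shape[0]
    ctrl_w = 1.0 / c - theta**2  # B R^-1 B^T - theta^2 B B^T = ctrl_w * B B^T
    est_w = 1.0 - theta**2       # C^T C - theta^2 C^T C = est_w * C^T C
    if ctrl_w <= 0.0 or est_w <= 0.0:
        raise ValueError("theta is too large for this simplified sketch")
    # SciPy solves A^T X + X A - X B R^-1 B^T X + Q = 0, so the theta
    # terms are absorbed into an effective control weight R = I / weight.
    Pi = solve_continuous_are(A, B, C.T @ C, np.eye(m) / ctrl_w)
    P = solve_continuous_are(A.T, C.T, B @ B.T, np.eye(p) / est_w)
    coupling = np.eye(n) - theta**2 * (P @ Pi)
    if np.min(np.linalg.eigvals(coupling).real) <= 0.0:
        raise ValueError("I - theta^2 P Pi is not positive definite")
    K = (B.T @ Pi) / c
    F = np.linalg.solve(coupling, P @ C.T)
    Ac = A - B @ K - F @ C + theta**2 * (B @ B.T @ Pi)
    return K, F, Ac, Pi, P

def closed_loop(A, B, C, K, F, Ac):
    """Block matrix of the coupled plant and compensator dynamics."""
    return np.block([[A, -B @ K], [F @ C, Ac]])

# Hypothetical lightly damped oscillator (not the BMB system).
A = np.array([[0.0, 1.0], [-2.0, -0.3]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
K, F, Ac, Pi, P = compensator(A, B, C, c=1.0, theta=0.0)
lqg_poles = np.linalg.eigvals(closed_loop(A, B, C, K, F, Ac))
```

Calling `compensator` for increasing `theta` until the positive-definiteness check fails is one simple way to locate the largest admissible parameter, which mirrors the iterative search the text describes as computationally expensive.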

For certain PDEs, the control law in (3) can be written in
integral form. That is,

$$u(t) = -\mathcal{K}x_c(t) = -\langle k_i(s), x_c(t)\rangle_X, \tag{8}$$

for spatial variable $s$ and where $k_i \in X$ for $i = 1,2$
(see for example [13]), and the kernels of the integrals,
$k_i(s)$, are called control functional gains. Control functional
gains  can  be  used  to  determine  optimal  sensor  placement 
(see  for  example  [14],  [15],  [16],  [17])  because  they  provide 
information  about  the  contribution  of  the  state  estimate  to 
the  overall  controller.  For  example,  an  area  where  a  control 
functional  gain  is  large  would  indicate  that  area  provides  a 
state  estimate  value  that  contributes  more  significantly  to  the 
controller.  Further,  there  would  be  potential  benefit  in  placing 
sensors  in  that  area. 

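Additionally, the observer gain operator $\mathcal{F} : \mathbb{R}^p \to X$ is
continuous and has range in $D(\mathcal{A}) \subset X$. Then, for a state
estimate of the form $x_c(t) = [w_c(t,\cdot), \ldots]^T$, $\mathcal{F}$ has the
representation

$$\mathcal{F}y = \begin{bmatrix} g_1(s) & \cdots & g_p(s) \\ h_1(s) & \cdots & h_p(s) \end{bmatrix}\begin{bmatrix} y_1 \\ \vdots \\ y_p \end{bmatrix} \in X, \tag{9}$$

where $g_1(s), \ldots, g_p(s), h_1(s), \ldots, h_p(s)$ are called observer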
functional  gains.  To  more  completely  analyze  the  problem 
of  sensor  placement,  observer  functional  gains  should  be 
examined  alongside  control  functional  gains.  For  example, 
an  area  where  an  observer  functional  gain  is  large  would 
indicate  that  area  provides  a  measurement  value  of  the  state 
that  contributes  more  significantly  to  the  overall  controller 
design.  Thus,  using  similar  logic  applied  in  the  case  of 
control  functional  gains,  there  would  be  potential  benefit  in 
placing  sensors  in  that  area. 

As documented in [14], this simple approach to sensor
placement does not take into account issues such as
performance  and  robustness.  However,  given  the  complex 
nature  and  relative  lack  of  understanding  of  aeroelastic  wing 
MAAS,  it  is  reasonable  to  examine  the  functional  gains  in 
this  problem  as  initial  work  toward  the  direction  of  designing 
sensors  for  these  aircraft. 


III.  AN  AIRCRAFT-INSPIRED  MODEL 


In  this  work  two  Euler-Bernoulli  beams  connected  on 
either  side  of  a  rigid  mass  are  used  to  model  an  aeroelastic 
wing  MAAS,  hereafter  referred  to  as  the  BMB  system.  The 
fuselage  of  the  MAAS  is  assumed  to  be  rigid.  A  schematic 
of the BMB system is given in Figure 1. Note that the BMB
system is meant to represent primarily the heave dynamics
of the MAAS. The MAAS is initially assumed to be flying
with wings straight and level and in equilibrium with the lift
balancing the weight. At time $t = 0$, there is assumed to be
a perturbation in the wings’ shape (caused by a sudden gust,
for example). This perturbed wing shape causes a change in
the local angle of attack distribution over each wing and this
in turn leads to a perturbation in the lift distribution denoted
by $\Delta\mathrm{Lift}(t,s)$. Each beam is modeled with both viscous and
Kelvin-Voigt damping, and it is assumed that the material
and inertial properties of both beams are homogeneous and
identical. Denoting the displacement of the left beam from its
initial equilibrium position at time $t$ and position $s$ by $w_L(t,s)$
and the corresponding displacement of the right beam at time
$t$ and position $s$ by $w_R(t,s)$, the model of the BMB system
is described as follows:

$$\rho a\frac{\partial^2}{\partial t^2}w_L(t,s) + EI\frac{\partial^4}{\partial s^4}w_L(t,s) + \gamma_1\frac{\partial}{\partial t}w_L(t,s) + \gamma_2 I\frac{\partial^5}{\partial t\,\partial s^4}w_L(t,s) = \frac{-\Delta\mathrm{Lift}(t,s)}{\ell/2} + b_L(s)u_L(t) \tag{10}$$

for $0 < s < \ell/2$, $t > 0$, and

$$\rho a\frac{\partial^2}{\partial t^2}w_R(t,s) + EI\frac{\partial^4}{\partial s^4}w_R(t,s) + \gamma_1\frac{\partial}{\partial t}w_R(t,s) + \gamma_2 I\frac{\partial^5}{\partial t\,\partial s^4}w_R(t,s) = \frac{-\Delta\mathrm{Lift}(t,s)}{\ell/2} + b_R(s)u_R(t) \tag{11}$$

for $\ell/2 < s < \ell$, $t > 0$, subject to boundary conditions


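$$\begin{aligned}
EI\frac{\partial^2}{\partial s^2}w_L(t,0) + \gamma_2 I\frac{\partial^3}{\partial t\,\partial s^2}w_L(t,0) &= 0, \\
EI\frac{\partial^3}{\partial s^3}w_L(t,0) + \gamma_2 I\frac{\partial^4}{\partial t\,\partial s^3}w_L(t,0) &= 0, \\
EI\frac{\partial^2}{\partial s^2}w_R(t,\ell) + \gamma_2 I\frac{\partial^3}{\partial t\,\partial s^2}w_R(t,\ell) &= 0, \\
EI\frac{\partial^3}{\partial s^3}w_R(t,\ell) + \gamma_2 I\frac{\partial^4}{\partial t\,\partial s^3}w_R(t,\ell) &= 0, \\
EI\frac{\partial^3}{\partial s^3}w_L(t,\ell/2) + \gamma_2 I\frac{\partial^4}{\partial t\,\partial s^3}w_L(t,\ell/2)
 - EI\frac{\partial^3}{\partial s^3}w_R(t,\ell/2) - \gamma_2 I\frac{\partial^4}{\partial t\,\partial s^3}w_R(t,\ell/2) &= m\frac{\partial^2}{\partial t^2}w_L(t,\ell/2), \\
w_L(t,\ell/2) &= w_R(t,\ell/2), \\
\frac{\partial}{\partial s}w_L(t,\ell/2) &= \frac{\partial}{\partial s}w_R(t,\ell/2), \\
EI\frac{\partial^2}{\partial s^2}w_L(t,\ell/2) + \gamma_2 I\frac{\partial^3}{\partial t\,\partial s^2}w_L(t,\ell/2)
 - EI\frac{\partial^2}{\partial s^2}w_R(t,\ell/2) - \gamma_2 I\frac{\partial^3}{\partial t\,\partial s^2}w_R(t,\ell/2) &= I_z\frac{\partial^3}{\partial t^2\,\partial s}w_R(t,\ell/2),
\end{aligned} \tag{12}$$
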


where $\rho$ is the density of the beam material, $a$ is the cross-sectional
area of the beam, $E$ is Young’s modulus, $I$ is the
area moment of inertia of the beam, $I_z$ is the mass moment
of inertia of the rigid mass, $\gamma_1$ is the coefficient of viscous
damping, $\gamma_2$ is the coefficient of Kelvin-Voigt damping, $m$
is the mass of the rigid connection between the beams,
$b_L(s)$ is the control input function for the left beam, $u_L(t)$
is the controller for the left beam, $b_R(s)$ is the control input
function for the right beam, $u_R(t)$ is the controller for the
right beam, and $\Delta\mathrm{Lift}(t,s)$ is the function representing the
perturbed lift force on each of the beams. In this work, $I_z$ is
taken to be zero, so the simulated BMB system is actually
a free-free beam with a point load in the center.


Fig.  1.  MAAS  model  system. 


Sensed information is used to design a feedback controller
that regulates the MAAS model system to the exponentially
stable zero equilibrium. It is assumed that the controllers act
over the entire beam structures with control input functions
of the form

$$b(s) = b_L(s) = b_R(s) = 0.5 \tag{13}$$

for $0 < s < \ell$, and available observations taking the form

$$y(t) = 0.25\,w(t,s), \tag{14}$$

for $0 < s < \ell$.


A.  Variational  Form  and  Discretization  of  BMB  System 

Now  consider  the  variational  form  of  the  BMB  system  in 
order  to  develop  a  Galerkin  finite  element  approximation  of 
the  problem.  For  brevity,  only  the  weak  formulation  of  the 
left  beam  will  be  presented;  the  formulation  for  the  right 
beam  follows  similarly.  Employing  the  shorthand  notation 
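$\dot{w}(t,s) = \frac{\partial}{\partial t}w(t,s)$ and $w'(t,s) = \frac{\partial}{\partial s}w(t,s)$ for this discussion,
the variational problem is that one seeks a $w_L(t,\cdot) \in V =
\{\varphi(\cdot) \in E\} \subset E = H^2(0,\ell/2)$ such that for all $\varphi \in V$

$$\begin{aligned}
&\int_0^{\ell/2}\rho a\,\ddot{w}_L(t,s)\varphi(s)\,ds + \int_0^{\ell/2}EI\,w_L''''(t,s)\varphi(s)\,ds \\
&\quad + \int_0^{\ell/2}\gamma_1\dot{w}_L(t,s)\varphi(s)\,ds + \int_0^{\ell/2}\gamma_2 I\,\dot{w}_L''''(t,s)\varphi(s)\,ds \\
&= \int_0^{\ell/2}\frac{-\Delta\mathrm{Lift}(t,s)}{\ell/2}\varphi(s)\,ds + \int_0^{\ell/2}b_L(s)\varphi(s)u_L(t)\,ds.
\end{aligned} \tag{15}$$
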


Now choose a basis $\{b_i\}_{i=1}^N$ for the approximating space
$V^N \subset V$, where $N$ corresponds to the number of gridpoints
used in the finite element approximation. In particular, since
$V^N \subset V \subset E = H^2(0,\ell/2)$, the state can be approximated
by a linear combination of cubic splines. Then the state is
approximated as

$$w_L(t,s) \approx w_L^N(t,s) = \sum_{i=1}^{N}c_i(t)b_i(s). \tag{16}$$

Using the state approximation (16) in (15) yields the matrix
equation

$$M_0\ddot{c}(t) + D_0\dot{c}(t) + K_0c(t) = F_0(c(t)) + B_0u_L(t) \tag{17}$$

where $c(t) = [c_1(t), \ldots, c_N(t)]^T$, $M_0$ is the mass matrix, $D_0$
is the damping matrix, $K_0$ is the stiffness matrix, $F_0(c(t))$
contains the lift function, and $B_0$ is the input matrix, all
defined by the following, for $i,j = 1, \ldots, N$:

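$$\begin{aligned}
[M_0]_{ij} &= \int_0^{\ell/2}\rho a\,b_i(s)b_j(s)\,ds + m\,b_i(\ell/2)b_j(\ell/2) - I_z\,b_i'(\ell/2)b_j'(\ell/2), \\
[D_0]_{ij} &= \int_0^{\ell/2}\gamma_1 b_i(s)b_j(s)\,ds + \int_0^{\ell/2}\gamma_2 I\,b_i''(s)b_j''(s)\,ds, \\
[K_0]_{ij} &= \int_0^{\ell/2}EI\,b_i''(s)b_j''(s)\,ds, \\
[F_0(c(t))]_j &= \int_0^{\ell/2}\frac{-\Delta\mathrm{Lift}(t,s)}{\ell/2}\,b_j(s)\,ds, \\
[B_0]_j &= \int_0^{\ell/2}b(s)b_j(s)\,ds.
\end{aligned} \tag{18}$$
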

b(s)  =  bL(s)  =  bR(s)  =  0.5 


(13) 


Convert (17) into a first order system by defining $x_1(t) = c(t)$
and $x_2(t) = \dot{x}_1(t) = \dot{c}(t)$, thereby yielding

$$\begin{bmatrix}\dot{x}_1(t) \\ \dot{x}_2(t)\end{bmatrix} = \begin{bmatrix}0 & I \\ -M_0^{-1}K_0 & -M_0^{-1}D_0\end{bmatrix}\begin{bmatrix}x_1(t) \\ x_2(t)\end{bmatrix} + \begin{bmatrix}0 \\ M_0^{-1}B_0\end{bmatrix}u_L(t) + \begin{bmatrix}0 \\ M_0^{-1}F_0(w(t))\end{bmatrix}, \tag{19}$$

where $x = [x_1(t), x_2(t)]^T = [x_1(t), \frac{d}{dt}x_1(t)]^T$. Note that (19)
is a finite-dimensional approximation of the system in (1).

B. Variational Form and Discretization of Sensitivity Equation for BMB System

This framework now provides the basis for implementing
control techniques discussed in Section II. Beyond control
design, the authors are interested in examining the effects
of the $H_\infty$ control parameter, $\theta$, on the displacement of the
beams and the controller itself. The dependence of these
quantities on $\theta$ is denoted explicitly with the following notation:
$w_L(t,s) = w_L(t,s;\theta)$ and $u_L(t) = u_L(t;\theta)$, respectively.
Continuous Sensitivity Equation Methods are employed for
examining the sensitivities of these quantities to changes
in the value of $\theta$ used in the $H_\infty$ control design. Make
the following definitions for the sensitivities: $s_{w_L}(t,s;\theta) =
\frac{\partial}{\partial\theta}w_L(t,s;\theta)$ for the sensitivity of beam displacement with
respect to $\theta$ at time $t$ and spatial location $s$ and $s_{u_L}(t;\theta) =
\frac{\partial}{\partial\theta}u_L(t;\theta)$ for the sensitivity of the controller with respect
to $\theta$ at time $t$.

Now derive the variational form of the sensitivity equation
by differentiating (10) with respect to $\theta$. One seeks a $s_{w_L}(t,\cdot) \in
V = \{\varphi(\cdot) \in E\} \subset E = H^2(0,\ell/2)$ such that for all $\varphi \in V$

$$\begin{aligned}
&\int_0^{\ell/2}\rho a\,\ddot{s}_{w_L}(t,s)\varphi(s)\,ds + \int_0^{\ell/2}EI\,s_{w_L}''''(t,s)\varphi(s)\,ds \\
&\quad + \int_0^{\ell/2}\gamma_1\dot{s}_{w_L}(t,s)\varphi(s)\,ds + \int_0^{\ell/2}\gamma_2 I\,\dot{s}_{w_L}''''(t,s)\varphi(s)\,ds \\
&= \int_0^{\ell/2}\frac{\partial}{\partial w_L}\left[\frac{-\Delta\mathrm{Lift}(t,s)}{\ell/2}\right]s_{w_L}(t,s)\varphi(s)\,ds + \int_0^{\ell/2}b_L(s)\varphi(s)s_{u_L}(t)\,ds.
\end{aligned} \tag{20}$$

Choose the same basis $\{b_i\}_{i=1}^N$ for the approximating space
$V^N \subset V$ as was used in the state approximation. Then the
state sensitivity is approximated as

$$s_{w_L}(t,s;\theta) \approx s_{w_L}^N(t,s;\theta) = \sum_{i=1}^{N}s_{c_i}(t)b_i(s), \tag{21}$$

and a finite dimensional approximation of (20) can be
rewritten as a matrix equation

$$M_0\ddot{s}_c(t) + D_0\dot{s}_c(t) + K_0s_c(t) = F_1(c(t),s_c(t)) + B_0s_u(t;\theta), \tag{22}$$

where $s_c(t) = [s_{c_1}(t), \ldots, s_{c_N}(t)]^T$, $M_0$, $D_0$, $K_0$, and $B_0$ are
defined in (18), and $F_1(c(t),s_c(t))$ is based upon the lift
function. Convert (22) into a first order system by defining
$s_{x_1}(t) = s_c(t)$ and $s_{x_2}(t) = \dot{s}_{x_1}(t) = \dot{s}_c(t)$, thereby yielding

$$\begin{bmatrix}\dot{s}_{x_1}(t) \\ \dot{s}_{x_2}(t)\end{bmatrix} = \begin{bmatrix}0 & I \\ -M_0^{-1}K_0 & -M_0^{-1}D_0\end{bmatrix}\begin{bmatrix}s_{x_1}(t) \\ s_{x_2}(t)\end{bmatrix} + \begin{bmatrix}0 \\ M_0^{-1}B_0\end{bmatrix}s_u(t) + \begin{bmatrix}0 \\ M_0^{-1}F_1(w(t),s_w(t))\end{bmatrix}, \tag{23}$$

where $s_x = [s_{x_1}(t), s_{x_2}(t)]^T = [s_{x_1}(t), \frac{d}{dt}s_{x_1}(t)]^T$. Combining
(19) and (23) yields the coupled system

$$\frac{d}{dt}\begin{bmatrix}x_1(t) \\ x_2(t) \\ s_{x_1}(t) \\ s_{x_2}(t)\end{bmatrix} = \begin{bmatrix}0 & I & 0 & 0 \\ H_1 & H_2 & 0 & 0 \\ 0 & 0 & 0 & I \\ 0 & 0 & H_1 & H_2\end{bmatrix}\begin{bmatrix}x_1(t) \\ x_2(t) \\ s_{x_1}(t) \\ s_{x_2}(t)\end{bmatrix} + \begin{bmatrix}0 \\ H_3u_L(t) \\ 0 \\ H_3s_u(t)\end{bmatrix} + \begin{bmatrix}0 \\ H_4 \\ 0 \\ H_5\end{bmatrix}, \tag{24}$$

where $I$ is the identity operator and

$$H_1 = -M_0^{-1}K_0, \quad H_2 = -M_0^{-1}D_0, \quad H_3 = M_0^{-1}B_0, \quad H_4 = M_0^{-1}F_0(w(t)), \quad H_5 = M_0^{-1}F_1(w(t),s_w(t)). \tag{25}$$

Now, (24) is a finite-dimensional approximation to a system
similar to the form of (1), where the additional terms appear
due to the coupled sensitivity equation. One can replace the
control $u_L(t)$ in (24) by the full state feedback control law

$$u_L(t;\theta) = -\mathcal{K}x(t;\theta) = -\mathcal{K}[x_1(t)\ \ x_2(t)]^T. \tag{26}$$

Furthermore, one can differentiate (26) with respect to $\theta$ to
compute $s_{u_L}(t;\theta)$ as follows

$$\begin{aligned}
s_{u_L}(t;\theta) = \frac{\partial}{\partial\theta}u_L(t;\theta) &= -R^{-1}\mathcal{B}^*\Pi\frac{\partial x(t;\theta)}{\partial\theta} - R^{-1}\mathcal{B}^*\frac{\partial\Pi}{\partial\theta}x(t;\theta) \\
&= -\mathcal{K}s_{w_L}(t;\theta) - R^{-1}\mathcal{B}^*\frac{\partial\Pi}{\partial\theta}w_L(t;\theta),
\end{aligned} \tag{27}$$

where the sensitivity of $\Pi$ with respect to $\theta$, $\frac{\partial\Pi}{\partial\theta}$, is computed
by differentiating (4) with respect to $\theta$ and solving a resulting
Lyapunov equation [18], [19].

IV. NUMERICAL RESULTS


To  obtain  a  solution  to  the  system  in  (24),  initial  conditions 
are  chosen  of  the  form:  x\  (0)  =  sin  (^f),  x2{e)  =  fcos(^), 
$s_{x_1}(0) = 0.75\,x_1(0)$, and $s_{x_2}(0) = 0.75\,x_2(0)$. That is, to
generate a nonzero state sensitivity, the authors choose the
initial conditions for the sensitivity equation to be 75% of
the initial conditions for the state equation. A finite element
approximation using Hermite interpolating cubic splines of
order $N = 20$ for the spatial discretization of each beam is
employed to simulate (24), and the parameter values for the
BMB system are as follows: $\ell = 10$ m, $\rho = 5.24$ kg/m$^3$, $w$
(width) $= 1/\sqrt{48}$ m, $h$ (height) $= 1/\sqrt{48}$ m, $a = wh = 1/48$
m$^2$, $E = 1.44 \times 10^9$ N/m$^2$, $I = 1/1327104$ m$^4$, $m = 5$ kg,
$\gamma_1 = 0.025$ kg/(m s), $\gamma_2 = 1 \times 10^4$ kg/(m$^5$ s). Originally,
standard 4 degree of freedom beam elements were selected
for the finite element approximation, where the degrees
of freedom correspond to displacements and slopes at the
endpoints of each beam element (see for example [20]).
However, due to numerical instabilities in solving the finite
dimensional approximations to (4) and (5) with the 4 degree
of freedom scheme, an approximation using 2 degrees of
freedom, displacements at the end of each beam element,
was developed. Numerical results from this approximation
scheme are presented in this paper.

For this discretization and set of parameter values, it was
found that the largest possible $H_\infty$ controller parameter $\theta$
that will guarantee $(I - \theta^2 P\Pi)$ being positive definite is
0.38. Therefore, all $H_\infty$ controllers implemented in this paper
use $\theta = 0.38$. Still, the reader is reminded of the interest
in examining the sensitivity of the state with respect to $\theta$
variation. In this work, the lift function is neglected, but it is
included in the written statement of the model and relevant
weak formulations since, ultimately, it is the intent that the
BMB system will closely model a MAAS system.

Fig. 2. Uncontrolled Position State

Approximate state and state sensitivities to $\theta$ are computed
for several values of the parameter, namely $\theta = 0.00$ (LQG
compensator), $\theta = 0.10$, $\theta = 0.20$, and $\theta = 0.38$. For reference,
the uncontrolled state plot is given in Figure 2. It is the
intent to design a feedback controller that will stabilize the
unstable uncontrolled system. The primary question of interest
in this paper is how to take advantage of the aeroelastic
wing feature of a MAAS to aid in control design efforts. A
secondary goal is to examine how sensitive the controlled
beam displacements are to variation in the $H_\infty$ control
parameter, $\theta$. Figure 3 contains plots of the state sensitivities
to the $\theta$ parameter. As can be seen from these simulations,
the state sensitivities for the various $\theta$ values depicted are
virtually indistinguishable. This observation suggests that for
the BMB system with the chosen parameters, the actual $\theta$
value used in control design may not be critical in regard to
controlled state performance.

Additionally, the authors examine s_u(t; θ), the sensitivity
of the controller with respect to θ, and these plots are
found in Figure 4. The results demonstrate that the controller
becomes more sensitive to θ as this parameter is increased.
Since the value of θ is closely connected to the robustness of
the controller, this observation suggests that the more robust
the controller, the more sensitive it is to θ.

Fig. 3. State Sensitivities: θ = 0.00 (top left), θ = 0.10 (top right), θ = 0.20
(bottom left), θ = 0.38 (bottom right)

Fig. 4. Controller Sensitivities: θ = 0.00 (top left), θ = 0.10 (top right),
θ = 0.20 (bottom left), θ = 0.38 (bottom right)

As a means to gain insight into the problem of sensor
placement, the authors examine the control and observer
functional gains, contained in Figures 5 and 6, respectively.
An  area  where  a  functional  gain  is  large  indicates  that  one 
should  consider  placing  a  sensor  in  that  region  of  the  spatial 
domain  since  it  appears  to  contribute  significantly  to  the 
control  design.  Due  to  the  small  scale  of  the  control  bending 
gains  in  Figure  5,  there  is  no  useful  information  to  ascertain 
from  this  plot.  The  control  velocity  gains  suggest  that  sensors 
be  placed  at  the  free  ends  of  the  beams.  The  observer  gains 
in  Figure  6  are  nearly  constant  so  that  there  is  no  useful 
information  to  ascertain  from  this  plot.  It  should  be  noted  that 
there  may  be  a  problem  with  convergence  of  the  functional 
gains, as can be seen from the plots. Normally, one examines
the functional gains for various discretizations with increasing
N to verify that gain convergence has been achieved.
However, for N = 5 and N = 10 for each beam, MATLAB®
reported that the Grammian matrix W = [K0 0; 0 M0] was
nearly singular so that computation of W⁻¹, as required
for gain computation (see [13]), may not be accurate. For
this reason, and the fact that finer discretizations than N =
40 on each beam are computationally intractable due to
the cubic spline basis required for Euler-Bernoulli beam
approximations, only gain computations for N = 20 and
N = 40 for each beam are shown.
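
The near-singularity warning described above can be reproduced with a simple conditioning check before the block-diagonal matrix is inverted. The sketch below is illustrative only: the function names are hypothetical, K0 and M0 stand for whatever blocks the chosen discretization produces, and the tolerance (matrix dimension times machine epsilon) is an assumed rule of thumb rather than the test MATLAB applies.

```python
import numpy as np

def assemble_w(K0, M0):
    """Build the block-diagonal matrix W = [K0 0; 0 M0]."""
    K0 = np.asarray(K0, dtype=float)
    M0 = np.asarray(M0, dtype=float)
    n, m = K0.shape[0], M0.shape[0]
    return np.block([[K0, np.zeros((n, m))],
                     [np.zeros((m, n)), M0]])

def nearly_singular(W, factor=1.0):
    """Flag W when its reciprocal 2-norm condition number is near machine precision."""
    rcond = 1.0 / np.linalg.cond(W)
    return bool(rcond < factor * W.shape[0] * np.finfo(float).eps)

def solve_with_w(W, rhs):
    """Solve W x = rhs by least squares instead of forming inv(W) explicitly."""
    x, _, rank, _ = np.linalg.lstsq(W, rhs, rcond=None)
    return x, rank
```

If `nearly_singular(W)` returns True, gains computed through W⁻¹ should be treated with caution. A returned rank smaller than the dimension of W indicates that the least-squares solve has discarded directions it could not resolve.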


Fig. 5. Control Functional Gains for θ = 0.38 with N = 20 (red) and N = 40
(black) for each beam: bending gains (left) and velocity gains (right)

Fig. 6. Observer Functional Gains for θ = 0.38 with N = 20 (red) and
N = 40 (black) for each beam: bending gains (left) and velocity gains (right)

V.  CONCLUSIONS  AND  FUTURE  WORKS 

A.  Conclusions 

In the paper, the BMB system (10), (11) is approximated
by Hermite interpolating cubic splines with 2 displacement
degrees of freedom for each beam element. Approximate
state and state sensitivities to θ are computed for several
values of the parameter θ. It is observed that the state
sensitivities for the various θ values depicted are virtually
indistinguishable. This suggests that for the BMB system
with the chosen parameters, the actual θ value used in
control design may not be critical in regard to controlled
state performance. The authors also examine the sensitivity
of the controller with respect to θ, and these results suggest
that the more robust the controller, the more sensitive it is to
θ. As a means to gain insight into the problem of sensor
placement, the authors examine the control and observer
functional gains. The results suggest that placing sensors
near the free ends of the beams may prove
advantageous to control design.

B.  Future  Works 

Numerical instabilities in solving the finite dimensional
approximations to the algebraic Riccati equations were
discovered, and this needs to be investigated. More investigation
needs  to  be  done  on  the  sensor  placement  problem  in  order 
to  take  into  account  sensor  placement  effects  on  performance 
and  robustness.  Instead  of  considering  a  point  load  between 
the  two  beams,  the  authors  are  interested  in  including  in  the 
BMB  model  a  mass  of  some  nonzero  size.  The  authors  plan 
to  include  a  realistic  aerodynamic  force  for  the  lift  function. 

Issues of Scale in Agile Micro Autonomous Systems

The quest for micro autonomous systems (MAS) is taking us from the realms of science
and  engineering,  as  with  the  University  of  California  at  Berkeley  micro  mechanical  flying 
insect,  to  areas  that  would  have  been  the  realm  of  science  fiction  just  a  few  years  ago,  as  in 
DARPA’s Nano Air Vehicle program. Emboldened by advances in micro-scale technologies
and  inspired  by  insight  into  the  mechanisms  associated  with  biological  locomotion,  eventual 
realization  of  bird  or  insect  size  autonomous  robots  seems  certain.  Among  the  many 
technical  challenges,  issues  associated  with  integration  of  MAS  into  complex  human-directed 
information  networks,  in  particular  issues  of  autonomous  sensory-response  architectures  for 
systems  with  multi-scale  dynamics,  may  prove  to  be  the  largest  hurdles.  This  paper 
speculates  on  the  existence  of  a  fundamental  characteristic  of  autonomous  systems  that  may 
underlie  those  hurdles. 


I.  Introduction 

HUMAN  engineered  systems  increasingly  rely  on  automation  to  enhance  performance,  provide  fault  tolerance 
and  allow  the  operator  to  concentrate  on  high-level  decisions  as  opposed  to  low-level  motor  control  tasks. 
These  systems  are  designed  to  be  responsive  to  human-generated  commands  but  at  the  same  time  robust  to 
disturbances  that  may  require  corrections  several  orders  of  magnitude  faster  than  human  response  times.  Advanced 
fighter  aircraft,  for  example,  maneuver  at  the  edge  of  human  sensory-response  capabilities  by  having  autopilots  that 
stabilize  the  aircraft  through  operating  regimes  beyond  the  capabilities  of  direct  human  control.  Artificial  limits  on 
the  aircraft  operational  envelope,  which  are  imposed  on  the  aircraft  performance  to  accommodate  the  limitations  of 
human  physiology  and  sensory-response  capabilities,  are  made  necessary  by  the  critical  role  of  the  human  as  pilot  of 
the  vehicle.  In  effect,  the  human  operated  fighter  aircraft  has  an  outer-loop/inner-loop  flight  control  system  in  which 
the  pilot  provides  the  sensing,  decision  processing  and  command  functions  to  the  inner-loop  autopilot  which,  in  turn, 
stabilizes the aircraft flight during maneuvers. This time- or frequency-based separation into a relatively
high-bandwidth inner stabilization loop and a lower-bandwidth outer command loop is a common control system
architecture  that  requires  the  physical  response  of  the  vehicle  in  its  interactions  with  its  surroundings  to  be  separable 
into  fast  and  slow  dynamics.  While  this  separation  is  usual  and  physically  justified  in  manned  aircraft  and  large 
UAVs,  it  may  not  be  applicable  to  agile  MAS  capable  of  aggressive  maneuvers  in  confined  space  where  the  relative 
kinematics  between  a  MAS  and  other  nearby  objects  may  require  a  response  bandwidth  on  the  same  time  scale  as 
the  MAS  body  rotational  dynamics.  Imposing  the  usual  separation  of  slow  and  fast  dynamics  on  a  MAS  design,  for 
example  by  reducing  its  response  bandwidth  to  mitigate  coupling  with  its  body  dynamics,  will  result  in  stable  but 
sluggish  vehicles  that  have  only  limited  agility. 

Of  course,  the  vision  is  for  MAS  to  achieve  or  even  exceed  the  agility,  performance  and  robustness  of  living 
systems.  We  entertain  notions  of  small  groups  of  MAS  capable  of  flight  through  urban  centers  much  like  flocks  of 
ubiquitous  pigeons  (Figure  1).  These  flocks  of  engineered  vehicles  would  have  the  flight  capabilities  of  flying 
animals but would be under the overall supervision of one or more human operators. That is, the MAS would require
significant autonomous flight capabilities to negotiate the confined and high uncertainty environment while
requiring positive human control of the vehicle swarm. Putting aside the various ethical issues associated with the use
of  autonomous  vehicles  in  human-occupied  environments,  the  challenges  imposed  by  the  multi-scale  dynamics 
inherent  in  this  scenario  are  large.  Human  command  and  decision  processes  may  span  minutes  to  days,  while  the 
dynamics  associated  with  micro-scale  flight  may  evolve  over  milliseconds.  Errors  at  the  human  decision  level  have 
obvious  potential  to  impact  the  overall  system  performance,  from  failure  to  perceive  and  act  upon  a  critical  piece  of 
information  to  issuing  erroneous  commands.  Similarly,  errors  at  the  MAS  level  propagate  upward  to  the  human 
decision  level,  producing  gaps  in  critical  information  or  distorting  the  context  of  otherwise  correct  information.  The 
emergent consequences of these different scales of errors are impossible to predict with our current system modeling
tools. Thus, the consequences of a MAS’s erroneous positive response to a benign chemical signature may be
negligible, merely resulting in the vehicle flying into a nearby window and disrupting a peaceful family dinner.
Alternatively, the consequences may be tragic, prompting escalation of a minor incident into a major disaster. Unfortunately,
our capabilities of engineering MAS seem to be outpacing our understanding of how to incorporate them into
fault-resistant human decision networks.


Figure  1.  Group  of  agile  MAS  entering  an  urban  canyon 


This  paper  takes  the  perspective  that  agile  MAS  with  their  layers  of  human  supervision  represent  complex, 
highly  nonlinear  multi-scale  dynamical  systems.  After  a  brief  discussion  of  some  issues  of  scale  for  such  systems 
and current research investigating those issues, the paper will focus on the idea of autonomy associated with
multi-scale dynamical systems. Agile MAS currently exist only in nature (i.e., insects, birds, bats). Consequently, the
paper  will  consider  autonomy  in  manmade  MAS  from  a  biological  perspective.  That  is,  it  will  speculate  that 
functional  system  characteristics  associated  with  the  capabilities  of  living  flying  organisms  may  require  levels  of 
response variation and flexibility that are not associated with, and perhaps will not be tolerated in, manmade critical
systems.  Although  this  paper  will  not  directly  address  questions  of  ethics  associated  with  the  deployment  of  critical 
autonomous  systems,  it  will  attempt  to  provide  some  insight  into  how  those  important  questions  may  naturally 
emerge  when  any  degree  of  robustness  is  imposed  as  a  design  criterion  for  manmade  agile  autonomous  systems. 

II.  Automatic  Control 

For  present  purposes,  ‘dynamical  systems’  can  be  thought  of  as  systems  which  evolve  through  time. 
Mathematically  their  behavior  can  be  described  by  combinations  of  differential  or  difference  equations.  In  addition 
to  familiar  examples  such  as  objects  in  motion,  fluid  flow  and  heat  flow,  this  definition  also  covers  modern 
‘information  networks’  such  as  human  decision  systems,  the  internet,  networked  communication  systems,  and 
command  &  control  systems. 

Dynamical  systems  which  evolve  over  a  narrow  time  scale  range  can  be  characterized  using  a  rich  body  of 
descriptive  and  computational  mathematics.  An  automobile  operating  on  cruise  control  provides  a  familiar  example. 
A  complete  description  of  all  of  the  dynamics  associated  with  engine,  friction  and  aerodynamic  forces  is  of 
extremely  high  order.  It  involves  time  scales  ranging  from  those  of  the  combustion  processes,  motion  induced 
aerodynamic  turbulence  and  heat  flux  during  severe  braking  to  those  of  the  vehicle  accelerator  response,  certainly 
several orders of magnitude. While on cruise control, however, the vehicle accelerates or decelerates in response to
road grade or wind variations to maintain a relatively constant speed. In this cruise mode, the dominant dynamics
associated with the vehicle motion are adequately described as a compact set of three linear first-order differential
equations with a time constant on the order of seconds. In fact, most familiar manmade systems, whether home
heating/cooling systems, home power generators or automatically piloted commercial aircraft, are designed to exhibit
this sort of relatively linear, narrow bandwidth response.
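
The cruise control example can be made concrete with a deliberately small simulation. The sketch below is not the three-equation model referred to above. It assumes a single first-order speed lag with a 5 second time constant, a proportional-integral controller, and a step disturbance standing in for a road grade. All parameter values and names are illustrative assumptions.

```python
import numpy as np

def simulate_cruise(setpoint=25.0, tau=5.0, kp=2.0, ki=0.5,
                    dist_time=30.0, dist=-3.0, t_end=90.0, dt=0.01):
    """Euler simulation of tau*dv/dt = -v + u + d with PI control u = kp*e + ki*z."""
    n = int(round(t_end / dt))
    t = np.arange(n + 1) * dt
    v = np.zeros(n + 1)
    v[0] = setpoint        # start at the set speed
    z = setpoint / ki      # integrator preloaded so the vehicle starts in equilibrium
    for k in range(n):
        d = dist if t[k] >= dist_time else 0.0  # step disturbance, e.g. a road grade
        e = setpoint - v[k]                      # speed error
        u = kp * e + ki * z                      # PI control signal
        v[k + 1] = v[k] + dt * (-v[k] + u + d) / tau
        z += dt * e                              # integrate the error
    return t, v
```

Calling `t, v = simulate_cruise()` gives a speed trace that dips briefly when the disturbance arrives and then returns to the set speed over a few tens of seconds, which is the narrow bandwidth behavior the paragraph describes.
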

Some manmade systems do not lend themselves to such a compact mathematical description. The most agile air
vehicles  currently  produced,  tactical  air  intercept  homing  missiles,  provide  an  interesting  example  of  a  wide-scale 
dynamical  system.  A  reasonably  minimal  description  of  such  a  missile  during  the  later  phases  of  a  target 
engagement  would  be  of  relatively  high  order  and  highly  nonlinear.  These  dynamics  would  include  the  target 
detection and warhead event, associated with time constants of fractions of a millisecond; the vehicle rigid body
dynamics, having time constants of tens of milliseconds; and the intercept kinematics, having time constants of hundreds
of milliseconds. As is typical for such systems, during the design process these different time scale dynamics are treated
separately. The warhead and target detection system are designed separately from the missile autopilot; the autopilot
is designed to stabilize the body rotational dynamics and to achieve the guidance system commanded accelerations;
and the guidance system is designed to generate acceleration commands to steer the missile close to an intercept with
the target.1

Continuing  with  this  example,  a  missile  developed  to  intercept  high  agility  targets  requires  guidance  systems 
capable of high bandwidth response (i.e., small time constants). This, in turn, requires that the autopilot have a much
higher bandwidth response, typically with time constants no more than 0.2 times those of the guidance system. Of course,
the airframe itself must be capable of achieving such small response time constants. For example, consider how fast
you can move a long flexible fishing rod versus a short stiff one. Move the long flexible rod relatively slowly and
the rod tip will follow the hand motion. Move it more quickly and the tip motion will be out of phase with the hand
motion. The short stiff rod, however, may be moved as quickly as you can with minimal deflection. Likewise, the
missile airframe must be stiff enough to produce the accelerations required to intercept the target. The design of a
wide bandwidth system such as this challenges the capabilities of the tools of automatic control.2
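
A rough calculation illustrates why the inner loop is kept several times faster than the outer loop. The sketch below rests on an assumed toy model, not a missile model: the outer loop is a pure integrator closed with gain 1/tau_outer, and the inner loop is a first-order lag with time constant tau_inner. The closed-loop characteristic polynomial is then tau_inner*s^2 + s + 1/tau_outer, with damping ratio 0.5*sqrt(tau_outer/tau_inner). The function names are hypothetical.

```python
import math

def damping_ratio(tau_inner, tau_outer):
    """Damping ratio of tau_inner*s^2 + s + 1/tau_outer = 0."""
    return 0.5 * math.sqrt(tau_outer / tau_inner)

def step_overshoot(tau_inner, tau_outer):
    """Fractional overshoot of the closed-loop step response (0.0 if not underdamped)."""
    zeta = damping_ratio(tau_inner, tau_outer)
    if zeta >= 1.0:
        return 0.0
    return math.exp(-math.pi * zeta / math.sqrt(1.0 - zeta**2))
```

Under these assumptions a time constant ratio of 0.2 gives a damping ratio of about 1.12 and no overshoot, while equal time constants give a damping ratio of 0.5 and roughly 16 percent overshoot. As the two time scales approach each other, the loops begin to interact.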

Automatically controlled dynamical systems have become pervasive in our technology-based society. From
climate  control  systems  in  homes  and  buildings  to  automated  aircraft  landing  systems,  the  notion  of  manmade 
systems  responding  to  changing  conditions  on  their  own  has  become  a  familiar  one.  The  idea  of  sensing  some  error 
in  desired  response  and  generating  a  correction  proportional  to  that  error  is  intuitive  and  has  its  origin  in  antiquity.  A 
textbook  example  is  that  of  the  mechanical  governor  of  James  Watt’s  steam  engine.  As  engine  speed 
increases/decreases,  a  spinning  pendulum  device  decreases/increases  steam  to  the  engine  through  a  mechanical 
linkage.  This  allows  the  engine  to  respond  to  varying  loads  with  consistent  performance  without  operator 
intervention,  a  measure  of  system  ‘performance’,  and  prevents  the  engine  from  exceeding  its  cycling  limits  if  the 
load  is  abruptly  changed,  a  measure  of  system  ‘robustness’.  The  rate  at  which  the  speed  of  the  governed  engine  can 
accommodate  load  variations  is  a  measure  of  its  response  bandwidth.  Again,  it  is  intuitive  that  beyond  a  threshold 
rate,  very  rapid  changes  to  the  engine  load  will  exceed  the  response  capabilities  of  the  engine  system.  For  example, 
this limited response may result from a response latency, or time delay, between a change in steam flow and the resulting
change in engine speed. These characteristics of performance, robustness, bandwidth, and time delay sensitivity comprise some of the
principal figures of merit for any controlled dynamical system. This example also illustrates another key feature of
most  automatically  controlled  dynamical  systems:  that  the  operator  interacts  with  the  system  through  modulation  of 
the  controller.  That  is,  the  engine  speed  is  regulated  by  adjusting  the  governor  rather  than  directly  adjusting  the 
steam  flow.  Thus,  the  human  operator  can  be  thought  of  as  an  ‘outer  loop  controller’,  modifying  the  speed  range  of 
the  engine  based  on  his  own  sensing  processes,  with  the  actual  speed  of  the  engine  regulated  by  the  ‘inner  loop 
controller’,  the  governor/steam  regulator. 

Manmade  automatically  controlled  machines  are  usually  designed  to  provide  a  fairly  linear  response  to 
commands,  however  nonlinear  the  underlying  dynamics  may  be.  In  effect,  the  controller  cancels  the  undesirable 
dynamics and replaces them with a desired linear dynamical response. Image stabilization in modern digital
point-and-shoot cameras provides a rather familiar example of this cancellation of dynamics. Photographer motion is
sensed  and  compensated  through  any  of  various  mechanisms  so  that  much  of  the  motion-induced  blur  is  removed 
from  the  resulting  image.  Any  photographer  motion  beyond  the  bandwidth  of  the  image  stabilization  system  will 
appear  as  image  blur. 

In the first half of the 20th century, mathematicians such as Norbert Wiener and colleagues established
information and decision theory as a foundation for development of dynamics and control systems theory and
methodology.3 Beginning with rudimentary notions of feedback (e.g., the modulation of dynamics based on sensed
signals in Watt’s mechanical governor), the latter half of the 20th century saw the birth and maturation of theories of
linear  multivariable,  linear  robust,  stochastic  linear,  adaptive,  nonlinear,  distributed  parameter  and  cooperative 
control,  to  name  only  a  few  categories.  Based  on  the  mathematics  of  linear  algebra,  set  theory,  real  and  complex 
analysis,  optimization  and  so  on,  the  methodologies  and  tools  available  for  control  system  design  have  become 
essential  to  the  operation  of  many  engineered  systems  from  compact  disc  players  to  commercial  aircraft.4 

These  tools  are  not  without  their  limitations.  To  continue  the  example  of  a  tactical  air  intercept  missile, 
separation  of  the  control  design  into  an  inner  autopilot  stabilization  loop  and  an  outer  guidance  intercept  loop 
imposes an artificial limitation on the missile intercept performance. With a high order dynamics description of the
coupled intercept kinematics and vehicle body dynamics of sufficient fidelity, a designer can produce a very
high-bandwidth controller that directly computes missile fin deflection commands from measurements of target
maneuver. Unfortunately, such a controller is very brittle in the sense that its response degrades or even becomes
unstable in the presence of inevitable errors in the dynamics model, unmodeled time delay, unmodeled
high-frequency dynamics, unpredictable disturbances, uncharacterized sensor noise, and target maneuver uncertainties.
Throughout the 1990s, many publications described various attempts to design integrated guidance and control
systems that recovered some of the response bandwidth sacrificed with inner-outer-loop designs. Few of these
approaches have been successful in practice for reasons of high design cost (e.g., requiring high-bandwidth
actuators, extensive tests to produce accurate dynamics models, low noise sensors, tight airframe manufacturing
tolerances, etc.) and lack of real-world robustness, the latter due to a combination of control methodology limitations
and the realities of operation in stressing environments.5

Much  of  the  research  on  integrated  guidance  and  control,  and  wide-bandwidth  control  in  general,  focuses  on 
increasing  performance  rather  than  robustness.  The  field  of  adaptive  control  instead  focuses  on  increased  robustness, 
or  equivalently  expansion  of  the  performance  regime  of  the  system.  Adaptive  controllers  implicitly  or  explicitly 
learn  the  unmodeled  or  unknown  system  dynamics  and  modify  the  control  signal  to  accommodate  their  impact  on 
the  desired  system  response.  Early  adaptive  control  methods  simply  adjusted  the  controller  gain  to  zero  the  error 
between  desired  and  actual  system  output  responses.  More  recent  adaptive  control  schemes  inject  an  additional 
control  signal  to  preserve  a  system’s  nominal  response  in  the  face  of  uncertainty  or  disturbances.  Some  of  the  most 
interesting and useful advances in control theory have occurred in adaptive control theory in the past ten to fifteen
years. Although adaptive control has proved useful in process control applications such as chemical processing and plants, the aerospace
industry has been slow to accept it. In the past decade, however, newer methods for design of adaptive
controllers have been applied to manned experimental aircraft and precision guided bombs.6,7

While  manmade  automatic  control  systems  are  common,  manmade  autonomous  systems  are  not.  The  reasons 
for  this  require  some  explanation  of  the  differences  between  the  two  concepts.  Essentially  all  automatic  control 
systems  are  designed  to  produce  desired  response  in  operation  over  rather  narrow  operating  regimes.  This  may  be 
accomplished through a combination of limiting the response bandwidth (i.e., essentially the closed-loop system
ignores disturbances, inputs and noise beyond its response bandwidth) and ad hoc limits imposed on the system
response  (e.g.,  min/max  thermostat  temperatures,  RPM  limiters  on  motor  control  systems,  physical  stops  on 
actuators,  cut-out  switches,  etc.).  These  features  allow  the  automatic  control  system  to  operate  without  human 
intervention  for  long  periods,  delivering  predictable  response  in  the  face  of  outside  disturbances;  the  automobile 
cruise  control  comes  to  mind. 

In  casual  usage,  autonomy  implies  a  level  of  response  robustness  beyond  that  associated  with  more  familiar 
automatic  control  systems,  whether  adaptive  or  not.  For  example,  a  commercial  aircraft  autopilot  allows  steady 
cruise,  climb  or  descent  in  the  presence  of  varying  winds,  but  an  autonomous  landing  system  must  allow  the  aircraft 
to negotiate the far more uncertain wind conditions near the ground. Note, however, that these kinds of ‘autonomous’
systems  are  still  designed  for  very  predictable  response  in  the  presence  of  an  expanded  range  of  uncertain,  but 
reasonably  characterizable  dynamic  disturbance  conditions.  For  present  purposes,  these  kinds  of  systems  will  be 
considered  an  elaboration  of  automatic  control  systems. 

The  concept  of  autonomy  as  used  in  this  paper  is  illustrated  by  examples  from  the  science  fiction  genre  of 
motion pictures: the spacecraft computer HAL in the movie 2001: A Space Odyssey, the cybernetic organisms in the
Terminator movies, or the robot Sonny in the movie I, Robot. These fictitious robots demonstrate both high levels of
response  robustness  and  similarly  high  levels  of  flexibility  in  response.  That  is,  they  vary  their  responses  to  be 
appropriate  to  the  context  of  the  current  and  anticipated  situations  in  ways  that  seem  very  ‘life-like’.  These  are 
systems  that  can  be  given  a  mission  and  allowed  to  respond  as  they  will  during  the  course  of  accomplishing  the 
mission.  This  is  a  very  different  sort  of  behavior  from  that  of  an  automatic  control  system,  whether  adaptive  or  not. 
And  it  is  specifically  this  kind  of  behavior  that  is  implied,  whether  intentionally  or  not,  by  many  descriptions  of 
MAS.8 

To  be  a  bit  more  specific,  this  concept  of  autonomy  implies  flexible  and  context-appropriate  behavioral 
response  in  the  presence  of  real  world  unpredictable  external  events.  Imagine  a  cooperative  group  of  MAS  flying 
through an urban canyon searching for a particular truck. These MAS presumably have the sensory capability
necessary to detect, identify and track the truck as well as to avoid collision with buildings, signs, power lines and
each  other.  Similarly,  they  presumably  have  sufficient  aerodynamic  agility  to  chase  the  truck,  once  it  is  identified, 
through  the  congested  streets  while  maneuvering  to  avoid  collisions  and  to  coordinate  their  efforts.  And  these 
sensory-response capabilities are robust to the high uncertainties associated with urban canyons: wide variations in
ambient luminance; surface textures varying from concrete to painted or reflective surfaces; as complex an acoustic
environment  as  may  be  imagined;  wind  gusts  that  may  exceed  the  vehicle  flight  speed;  etc.9 

The  concluding  sections  of  this  paper  suggest  that  the  behavior  of  autonomous  mobile  systems  involves 
variation  and  flexibility  in  response  that  is  significantly  different  from  that  of  manmade  automatic  control  systems, 
whether  adaptive  or  not.  And  the  basis  for  this  point  of  view  begins  with  the  observation  that  the  capabilities 
required for this urban canyon MAS scenario are ‘life-like’, in the sense imagined by the writers of the movies
mentioned  above. 


III.  Agility  and  Autonomy  in  Biological  Flight 

Although  the  notion  of  automatic  operation  was  a  rare  feature  of  human  technology  until  the  last  century,  and 
the  notion  of  autonomy  as  described  in  the  previous  section  is  essentially  absent  from  current  human  engineered 
mobile systems, autonomy is an inherent feature of biological system response at all size and temporal scales.
Somewhat  surprisingly,  this  is  an  underappreciated  fact  given  the  incredible  diversity  of  life  processes  and  life  forms 
on  the  Earth.  In  order  to  see  this,  the  response  of  manmade  automatic  control  systems  needs  to  be  contrasted  with 
that  of  biological  processes. 

Return  to  the  example  of  a  tactical  air  intercept  missile  once  again.  The  missile  autopilot  is  designed  to  reject 
disturbances  and  produce  airframe  acceleration  response  to  guidance  commands  over  a  range  of  altitude  and  velocity 
conditions  that  comprise  the  operational  envelope  for  the  missile.  Within  limits  imposed  by  the  autopilot  design  or 
control  surface  effectiveness,  the  autopilot  will  track  whatever  commands  the  guidance  law  generates  and  do  so  with 
a  certain  error  and  latency.  Analogously,  think  of  using  the  cruise  control  to  modulate  speed  to  accommodate  the 
flow  of  traffic  on  an  interstate  highway.  At  first  glance,  this  would  seem  to  be  similar  to  the  response  associated  with 
a peregrine falcon’s steady flight, perhaps with other hawks, during a seasonal migration.

The missile guidance system, itself an outer-loop feedback control system for the closed-loop,
autopilot-controlled airframe dynamics, estimates the relative motion of the target with respect to the missile and generates
acceleration  commands  to  maintain  an  intercept  course  with  the  target.  As  the  target  maneuvers,  the  acceleration 
commands  to  the  autopilot  are  automatically  adjusted  so  that  the  missile  maneuvers  to  accommodate  target  motion. 
As  long  as  the  acceleration  commands  do  not  exceed  the  autopilot  magnitude  limits,  and  the  guidance  system 
bandwidth  is  sufficiently  low  with  respect  to  the  autopilot/airframe  bandwidth,  the  autopilot  will  track  the 
commands  and  the  missile  will  intercept  the  target  within  a  certain  margin  of  error.  Further,  the  missile  guidance 
system  can  be  expected  to  have  been  designed  in  such  a  way  that  it  will  try  to  maintain  an  intercept  course  to  the 
target  in  spite  of  target  attempts  to  flee  or  to  deceive  the  missile  guidance  system.  Again,  this  would  seem  to  be  very 
similar  to  a  Peregrine  falcon’s  predation  attempts  on  a  fleeing  duck  or  grouse. 

The  predicted  performance  of  a  tactical  air  intercept  missile  is  often  characterized  by  mean  and  standard 
deviation  of  the  distance  of  closest  approach  in  Monte  Carlo  simulation  analysis.  Reasonable  random  and  bias 
errors,  various  target  maneuvers,  and  various  engagement  initial  conditions  are  introduced  into  a  high  fidelity 
dynamics  simulation  of  the  intercept  scenario  to  account  for  the  dominant  uncertainties  inherent  in  real  world 
scenarios.  The  missile  system  designer  tries  to  adjust  the  various  design  parameters  at  his/her  disposal  to  minimize 
expected  miss  distance  (in  the  sense  of  mean  and  variance  as  measured  through  the  Monte  Carlo  analysis)  over  all 
expected  engagement  conditions.  Over  the  lifespan  of  the  missile  type,  the  design  may  be  further  refined  based  on 
analysis  of  flight  tests  or  real  world  engagements.  In  any  event,  the  design  objective  can  be  summed  up  as  producing 
a  desired  nominal  behavior  characterized  by  minimized  mean  and  variance  of  miss  distance  (or  other  suitable  figure 
of  merit).  Further,  the  design  analysis  may  establish  confidence  intervals  associated  with  the  nominal  behavior,  a 
measure  of  system  robustness.  While  the  details  may  differ  greatly  among  other  human  engineered  mobile  automatic 
control  systems,  the  design  objective  of  desired  nominal  behavior  over  some  range  of  conditions  (i.e.,  robustness)  seems 
to  be  nearly  universal. 
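
The Monte Carlo procedure described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration and is not taken from this paper: the planar geometry, the proportional navigation guidance law, the first-order autopilot lag, and all numerical values (speeds, acceleration limit, noise levels, navigation gain). It is a toy model, not a high fidelity simulation. The sketch draws random heading errors, seeker bias and noise, target maneuvers and initial conditions for each run, then reports the mean and standard deviation of the distance of closest approach along with a normal-approximation 95% confidence interval on the mean.

```python
import math
import random
import statistics


def simulate_engagement(rng, dt=0.005, nav_gain=4.0, accel_limit=300.0, tau=0.1):
    """Fly one planar engagement; return the distance of closest approach (m).

    All parameter values are hypothetical and chosen only for illustration.
    """
    mx, my, vm = 0.0, 0.0, 600.0                       # missile position, speed
    tx, ty, vt = 5000.0, rng.uniform(-500.0, 500.0), 250.0
    t_heading = math.pi + rng.uniform(-0.3, 0.3)       # target roughly inbound
    t_accel = rng.uniform(-50.0, 50.0)                 # constant target maneuver
    m_heading = math.atan2(ty, tx) + rng.gauss(0.0, math.radians(5.0))
    los_bias = rng.gauss(0.0, 1e-3)                    # seeker bias error, rad/s
    accel = 0.0                                        # achieved lateral accel

    for _ in range(int(30.0 / dt)):
        rx, ry = tx - mx, ty - my                      # relative position
        vx = vt * math.cos(t_heading) - vm * math.cos(m_heading)
        vy = vt * math.sin(t_heading) - vm * math.sin(m_heading)
        # Time of closest approach for straight-line relative motion.
        t_star = -(rx * vx + ry * vy) / (vx * vx + vy * vy)
        if t_star <= dt:
            t_star = max(t_star, 0.0)
            return math.hypot(rx + vx * t_star, ry + vy * t_star)

        dist = math.hypot(rx, ry)
        los_rate = (rx * vy - ry * vx) / dist**2       # line-of-sight rate
        closing_speed = -(rx * vx + ry * vy) / dist
        measured = los_rate + los_bias + rng.gauss(0.0, 2e-3)
        # Guidance (outer loop): proportional navigation command, limited.
        cmd = nav_gain * closing_speed * measured
        cmd = max(-accel_limit, min(accel_limit, cmd))
        # Autopilot/airframe (inner loop) modeled as a first-order lag.
        accel += (cmd - accel) * dt / tau

        # Euler integration of both vehicles.
        mx += vm * math.cos(m_heading) * dt
        my += vm * math.sin(m_heading) * dt
        tx += vt * math.cos(t_heading) * dt
        ty += vt * math.sin(t_heading) * dt
        m_heading += accel / vm * dt
        t_heading += t_accel / vt * dt

    return math.hypot(tx - mx, ty - my)                # timed out without a pass


def monte_carlo(n_runs, seed=0):
    """Return (mean, std, 95% CI on the mean) of miss distance over n_runs."""
    rng = random.Random(seed)
    misses = [simulate_engagement(rng) for _ in range(n_runs)]
    mean = statistics.fmean(misses)
    std = statistics.stdev(misses)
    half_width = 1.96 * std / math.sqrt(n_runs)
    return mean, std, (mean - half_width, mean + half_width)


if __name__ == "__main__":
    mean, std, ci = monte_carlo(200)
    print(f"miss distance: mean={mean:.2f} m, std={std:.2f} m, 95% CI={ci}")
```

In a design study of the kind described, parameters such as `nav_gain` or `tau` would be varied and the Monte Carlo statistics recomputed for each candidate, with the designer seeking the combination that minimizes the mean and variance of miss distance over the expected engagement conditions.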

A  perusal  of  the  animal  behavior  literature  at  first  seems  very  familiar  in  the  context  of  the  discussion  of  the 
preceding  paragraphs.  Biologists  make  observations  of  animal  responses,  whether  to  artificial  stimuli  in  a  laboratory 
experiment  or  to  natural  stimuli  in  the  field,  characterize  the  responses  using  metrics  such  as  mean  and  standard 
deviation,  and  establish  confidence  intervals  using  various  statistical  tests.  Especially  within  many  biology 
experimental  laboratories,  there  seems  to  be  an  almost  engineering  mindset  to  describe  nominal  behavior  and 
characterize  variations  with  respect  to  the  nominal.  This  is  only  good  science!  Experiments  are  carefully  designed  to 
be  replicated  a  sufficient  number  of  times  so  that  statistical  analysis  of  the  results  will  be  valid,  allowing  readers  of 
the  published  results  to  infer  the  relative  merits  of  the  conclusions.  Plots  of  response  often  include  error  statistics 
that  may  suggest,  especially  to  the  non-biologist,  an  almost  engineered  nominal  response.  This  seems  to  be  true 
whether  the  experiments  involve  study  of  behavioral  intraspecific  interactions  among  animals;  study  of  the 
neurobiology  of  animal  sensory  systems;  reconstruction  of  flight  mechanics  and  aerodynamics  of  animals  flying  in 
wind  tunnels;  response  of  physiological  processes  to  perturbations;  study  of  biochemistry  associated  with  metabolic 
processes;  study  of  cellular  mass  and  energy  transport  mechanisms;  or  study  of  protein  transcription  or  nucleic  acid 
replication. 

The  complexity  of  biological  system  responses  at  all  scales  requires  this  kind  of  approach.  In  order  for  an 
experiment  or  study  to  be  capable  of  being  replicated,  which  is  an  obvious  requirement  of  a  credible  scientific 
endeavor,  experimental  conditions  must  be  controlled,  or  the  observational  study  scope  narrowly  defined,  so  that 
response  and  stimuli  may  be  reliably  correlated.  While  this  is  true  with  the  study  of  any  complex  phenomenon,  it 
seems  to  be  inherent  in  essentially  every  biological  study. 

Unfortunately,  and  this  is  speculation,  it  seems  that  these  tendencies  to  nominal  behaviors  exist  primarily  over 
conditions  associated  with  the  specific  study.  An  impression  one  gathers  from  discussions  with  biologists  or  from 
the  published  literature  is  that  variation  in  response  among  different  individuals  within  the  same  species,  or  even 
among  subsequent  trials  with  the  same  individual  test  subject,  is  large.  Furthermore,  this  response  variation  may  be 
correlated  with  very  subtle  differences  among  conditions  in  subsequent  experimental  setups;  differences  that  would 
seem  to  be  irrelevant  in  the  context  of  the  study. 

Similar  impressions  arise  almost  immediately  from  reading  studies  of  animal  social  behavior.  In  vertebrate 
observational  studies,  behavioral  differences  among  individuals  in  a  social  group  often  allow  researchers  to 
distinguish  individuals  at  a  glance.  Read  Jane  Goodall’s  accounts  of  the  Gombe  Reserve  chimpanzees  or  George 
Schaller’s  studies  of  lion  prides  on  the  Serengeti.10,11  But  this  also  seems  to  hold  to  a  significant  degree  for  fish 
schools,  passerine  bird  flocks,  bat  colonies,  in  fact  for  any  vertebrate  group  you  can  think  of.  Not  only  do  animals 
assume  different  roles  within  structured  social  groups,  but  the  behaviors  of  different  animals  playing  the  same  roles 
differ  in  significant  ways.  The  layperson’s  impression  of  homogeneity  in  response  for  these  organisms  may  only  be 
due  to  limited  resolution  of  the  observation  (e.g.,  sheep,  and  their  shepherds,  recognize  other  sheep!). 

Animals  typically  associated  with  more  stereotyped  behavior  also  seem  to  exhibit  large  individual  variations  in 
response.  It  has  long  been  known  that  honeybees  change  behavioral  roles  within  a  colony  as  they  age.  Stress, 
variations  in  food  supply,  weather  conditions  and  other  external  conditions  can  modify  the  timing  of  these 
maturation  effects.  And  as  with  vertebrates,  honeybees  show  individual  behavioral  differences  even  within  the  same 
age  class.  Again,  discussion  with  insect  biology  experimentalists  leaves  one  with  the  impression  that  insect  behavior 
is  far  from  being  predictable  to  the  degree  that  one  associates  with  well-engineered  mechanical  systems. 

IV.  Implications  for  Manmade  Autonomous  Systems 

To  set  the  stage  for  the  closing  discussion,  consider  the  following  behavioral  study  thought  experiment.  Choose 
at  random  100  missiles  of  a  given  model  and  fire  them  one  at  a  time  against  targets  of  a  given  type  under  a  range  of 
reasonable  engagement  conditions.  Chances  are  very  good  that  the  distributions  of  miss  distances,  times  to  intercept, 
trajectories,  etc.,  would  be  consistent  with,  though  not  identical  to,  those  obtained  from  a  Monte  Carlo  simulation 
study  of  the  same  missile  model  evaluated  over  a  similar  range  of  scenarios.  That  this  is  a  reasonable  expectation 
emerges  from  two  related  phenomena:  the  dynamics  models  in  the  simulation  have  been  refined  to  yield  a  high 
fidelity  representation  of  the  actual  scenarios;  and  the  missiles  have  been  designed  to  yield  consistent,  reproducible 
desired  nominal  behavior.  This  kind  of  predictable  behavior  is  often  termed  ‘mechanistic’,  even  when  ascribed  to 
human  behavior  such  as  that  of  a  choreographed  dance  performance. 
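
One way to make "consistent with, though not identical to" concrete is a two-sample comparison of the flight-test and simulation miss-distance distributions. The sketch below uses the two-sample Kolmogorov-Smirnov statistic, which is a choice made here for illustration and not a method named in this paper. Both samples are synthetic draws from a hypothetical Rayleigh-shaped distribution, so they stand in for data that the thought experiment does not actually provide. The coefficient 1.358 is the standard large-sample value for a 5% significance level.

```python
import math
import random


def ks_statistic(sample_a, sample_b):
    """Largest gap between the two empirical cumulative distributions."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Step both empirical CDFs past x (this also handles tied values).
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def ks_critical_value(n, m, coeff=1.358):
    """Asymptotic rejection threshold; coeff=1.358 corresponds to alpha=0.05."""
    return coeff * math.sqrt((n + m) / (n * m))


# Synthetic stand-ins (hypothetical): 100 "firings" and 5000 "simulation runs",
# both drawn from the same Rayleigh-shaped miss-distance distribution.
rng = random.Random(0)
flight_tests = [rng.weibullvariate(3.0, 2.0) for _ in range(100)]
simulation = [rng.weibullvariate(3.0, 2.0) for _ in range(5000)]

d = ks_statistic(flight_tests, simulation)
threshold = ks_critical_value(len(flight_tests), len(simulation))
consistent = d <= threshold   # True means no evidence the distributions differ
```

Applied to the falcon experiments discussed later in this section, the same comparison would be expected to show much wider spreads, and the test says nothing about why the distributions differ, only whether they do.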

Now  perform  an  analogous  experiment  with  a  falconer  releasing  100  trained  Peregrine  falcons  of  similar  age 
and  training  experience  one  at  a  time  against  a  sequence  of  fleeing  grouse  (recall  it  is  a  thought  experiment).  On  any 
given  day,  the  flight  performance  of  a  trained  bird-of-prey  such  as  a  falcon  may  be  influenced  by  many  factors 
including  how  recently  it  has  eaten,  its  molt  condition,  the  season,  its  general  state  of  health,  etc.  Although  these  are 
trained  animals,  one  would  not  be  surprised  to  find  the  performance  variation  to  be  quite  large.  That  is,  the  spreads 
in  the  distributions  of  number  of  passes  to  capture,  times  to  capture,  paths  flown  during  the  pursuit,  etc.  would  be 
large  when  compared  with  related  figures  of  merit  in  the  missile  experiment.  We  expect  this  since  animals,  after  all, 
are  animals  and  their  behavior  is  rarely  ‘mechanistic’  in  the  sense  of  being  highly  predictable  over  long  time  scales. 

Finally  perform  a  similar  experiment  with  release  of  100  wild,  untrained  Peregrine  falcons,  again  of  similar 
ages,  one  at  a  time  against  a  sequence  of  fleeing  grouse.  We  naturally  expect  the  performance  variation  to  be  even 
larger  than  with  the  trained  animals.  The  object  of  the  training,  after  all,  is  to  produce  repeatable,  predictable  desired 
nominal  behavior. 

What  does  this  have  to  do  with  design  of  agile  MAS?  Even  if  these  observations  with  respect  to  biological 
autonomous  response  are  valid  as  conjectured,  correlation  of  response  flexibility  with  autonomy  does  not  imply 
causation.  Biological  systems  that  emerge  through  the  interplay  of  the  complex  processes  of  evolution  may  exhibit 
response  variation  as  a  byproduct  of  the  variations  necessary  for  powering  evolution.  Certainly,  species  with  highly 
specialized  behavior  are  more  seriously  affected  by  environmental  change  than  those  with  more  varied  behavioral 
repertoires.  Many  of  the  animal  extinctions  of  the  past  few  centuries  involve  such  ecological  or  behavioral 
specialists.  Hence  animals  capable  of  tolerating  large  ecological  perturbations  would  naturally  be  supposed  to  have 
behavioral  repertoires,  individually  and/or  collectively,  that  allow  adaptation  to  the  environmental  fluctuations. 

Although  correlation  certainly  does  not  imply  causation,  an  argument  can  be  developed  that  suggests  behavioral 
response  flexibility  is,  in  fact,  naturally  and  intrinsically  associated  with  autonomous  behavior.  The  same  argument 
suggests  that  MAS  capable  of  interacting  with  their  surroundings  in  the  complex  ways  envisioned  by  technologists 
will,  at  the  very  least,  exhibit  the  variations  in  response  associated  with  highly  trained  animals  (or  human  groups!) 
and  will  not  exhibit  the  relatively  high  performance  predictability  currently  associated  with  automated  machines.  Anyone  who  has 
walked  a  normally  well  behaved  male  dog  in  the  vicinity  of  a  female  dog  in  season  will  appreciate  the  difference. 

An  outline  of  such  an  argument  might  begin  with  consideration  of  autonomous  systems  that  exhibit  context- 
appropriate  behavioral  responses  to  essentially  unpredictable  events.  One  might  then  make  the  following  assertions, 
each  of  which  is  disprovable,  at  least  in  principle: 

•  Real  world  complex  environments,  whether  natural  or  manmade,  generate  unpredictable 
events  over  behaviorally  relevant  time  and  spatial  scales 

•  It  is  impossible  to  model  the  important  dynamics  of  real  world  complex  environments,  whether 
natural  or  manmade,  at  sufficient  levels  of  fidelity  required  to  a  priori  define  context- 
appropriate  responses 

•  The  degree  of  flexibility  associated  with  a  behavioral  repertoire,  independent  of  the  size  of  a 
behavioral  repertoire,  determines  the  range  of  context-appropriate  responses  available 

•  The  range  of  context-appropriate  responses  available  determines  the  range  of  unpredictable 
events  that  can  be  accommodated 

A  reasonable  inference  from  these  assertions  is  that  environmental  complexity  drives  a  requirement  for  behavioral 
response  flexibility  and  makes  it  a  necessary  attribute  for  any  system  capable  of  accommodating  uncertainties 
associated  with  a  real  world  complex  environment.  This,  of  course,  falls  far  short  of  a  proof  that  behavioral  response 
flexibility  is  necessarily  associated  with  autonomous  systems,  but  it  motivates  consideration  of  the  possibility  that 
such  might  be  the  case.  The  possibility  merits  further  investigation. 

If  the  preceding  discussion  has  merit,  the  natural  question  emerges  of  whether  human  society  is  prepared  to 
accept  that  MAS  may  operate  more  like  trained  animals  than  more  familiar  automated  mechanical  devices.  We  are 
reasonably  comfortable  with  the  knowledge  that  even  highly  domesticated  animals  occasionally  exhibit  undesirable 
behavior.  Whether  we  can  become  comfortable  with  the  potential  for  similar  behavior  from  MAS  will  be  an  open 
question  until  such  systems  arrive. 